Fwd: Re: Negative array indicies and slice()

2012-11-03 Thread Andrew Robinson

Forwarded to python list:

-------- Original Message --------
Subject:Re: Negative array indicies and slice()
Date:   Sat, 03 Nov 2012 15:32:04 -0700
From:   Andrew Robinson
Reply-To:   andr...@r3dsolutions.com
To: Ian Kelly 

On 11/01/2012 05:32 PM, Ian Kelly wrote:

 On Thu, Nov 1, 2012 at 4:25 PM, Andrew Robinson

 The bottom line is:  __getitem__ must always *PASS* len( seq ) to slice()
 each *time* the slice() object is used.  Since this is the case, it would
 have been better to have list, itself, have a default member which takes the
 raw slice indices and does the conversion itself.  The size would not need
 to be duplicated or passed -- memory savings,   speed savings...

 And then tuple would need to duplicate the same code.  As would deque.
   And str.  And numpy.array, and anything else that can be sliced,
 including custom sequence classes.

I don't think that's true.  A generic function can be shared among
different objects without being embedded in an external index data
structure to boot!

If *self* were passed to an index conversion function (as would
naturally happen anyway if it were a method), then the method could take
len( self ) without knowing what the object is;
Should the object be sliceable -- the len() will definitely return the
required piece of information.
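
A minimal sketch of that idea, assuming Python 3. The names resolve_slice and Wrapper are hypothetical, invented only for illustration; the helper receives the sequence itself and reads len() from it, so it is written once and shared. Note that it still delegates to slice.indices() internally.

```python
def resolve_slice(seq, a_slice):
    """Return concrete (start, stop, step) for a_slice applied to seq.

    Works for any object that supports len(); the helper never needs
    to know what kind of sequence it was handed.
    """
    return a_slice.indices(len(seq))


class Wrapper:
    """Hypothetical sliceable container that reuses the shared helper."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return [self.items[i] for i in range(*resolve_slice(self, key))]
        return self.items[key]


w = Wrapper("abcdefgh")
tail = w[-3:]     # negative start is resolved against len(w)
evens = w[::2]
print(tail, evens)
```

The same helper can be handed a tuple, a str, or a custom class without change, which is the sharing being argued for here.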

 Numpy arrays are very different internally from lists.

Of course!  (Although, lists do allow nested lists.)

 I'm not understanding what this is meant to demonstrate.  Is MyClass
 a find-replace error of ThirdParty?  Why do you have __getitem__
 returning slice objects instead of items or subsequences?  What does
 this example have to do with numpy?

Here's a very cleaned up example file, cut and pastable:
#!/bin/env python
# File: sliceIt.py  --- a pre PEP357 hypothesis test skeleton

class Float16():
    """
    Numpy creates a float type, with very limited precision -- float16
    Rather than force you to install np for this test, I'm just making a
    faux object.  normally we'd just import np
    """
    def __init__(self,value): self.value = value
    def AltPEP357Solution(self):
        """ This is doing exactly what __index__ would be doing. """
        return None if self.value is None else int( self.value )

class ThirdParty( list ):
    """
    A simple class to implement a list wrapper, having all the
    properties of a normal list -- but explicitly showing portions
    of the interface.
    """
    def __init__(self, aList): self.aList = aList

    def __getitem__(self, aSlice):
        print( "__getitems__", aSlice )
        temp=[]
        edges = aSlice.indices( len( self.aList ) ) # *unavoidable* call
        for i in range( *edges ): temp.append( self.aList[ i ] )
        return temp

def Inject_FloatSliceFilter( theClass ):
    """
    This is a courtesy function to allow injecting (duck punching)
    a float index filter into a user object.
    """
    def Filter_FloatSlice( self, aSlice ):

        # Single index retrieval filter
        try: start=aSlice.AltPEP357Solution()
        except AttributeError: pass
        else: return self.aList[ start ]

        # slice retrieval filter
        try: start=aSlice.start.AltPEP357Solution()
        except AttributeError: start=aSlice.start
        try: stop=aSlice.stop.AltPEP357Solution()
        except AttributeError: stop=aSlice.stop
        try: step=aSlice.step.AltPEP357Solution()
        except AttributeError: step=aSlice.step
        print( "Filter To",start,stop,step )
        return self.super_FloatSlice__getitem__( slice(start,stop,step) )

    theClass.super_FloatSlice__getitem__ = theClass.__getitem__
    theClass.__getitem__ = Filter_FloatSlice

# EOF: sliceIt.py

Example run:

>>> from sliceIt import *
>>> test = ThirdParty( [1,2,3,4,5,6,7,8,9] )
>>> test[0:6:3]
('__getitems__', slice(0, 6, 3))
[1, 4]

>>> f16=Float16(8.3)
>>> test[0:f16:2]
('__getitems__', slice(0, <sliceIt.Float16 instance at 0xb74baaac>, 2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sliceIt.py", line 26, in __getitem__
    edges = aSlice.indices( len( self.aList ) )  # This is an *unavoidable* call
TypeError: object cannot be interpreted as an index

>>> Inject_FloatSliceFilter( ThirdParty )
>>> test[0:f16:2]
('Filter To', 0, 8, 2)
('__getitems__', slice(0, 8, 2))
[1, 3, 5, 7]

>>> test[f16]
9

 We could also require the user to explicitly declare when they're
 performing arithmetic on variables that might not be floats. Then we
 can turn off run-time type checking unless the user explicitly
 requests it, all in the name of micro-optimization and explicitness.

:) None of those would help micro-optimization that I can see.

 Seriously, whether x is usable as a sequence index is a property of x,
 not a property of the sequence.

Yes, but the *LENGTH* of the sequence is a function of the *sequence*.

 Users shouldn't need to pick and choose *which* particular sequence
 index types their custom sequences are willing

Re: Negative array indicies and slice()

2012-11-02 Thread Andrew Robinson


Hi Ian,

I apologize for trying your patience with the badly written code 
example.  All objects were meant to be ThirdParty(), the demo was only 
to show how a slice() filter could have been applied for the reasons 
PEP357 made index() to exist.
eg: because numpy items passed to __getitems__ via slice syntax [::] 
were illegal values.
PEP 357 is the one who specifically mentioned Numpy types -- which is 
the only reason I used the name in the example;  I could have just as 
well used a string.


I am fully aware of what numpy does -- I have used it; modified the 
fortran interfaces underneath, etc.


The index() method, however, affects *all* list objects in Python, not 
just Numpy's -- correct?


I'll write a working piece of code tomorrow to demonstrate the filter 
very clearly rather than a skeleton, and test it before posting.


--
http://mail.python.org/mailman/listinfo/python-list

Re: Negative array indicies and slice()

2012-11-02 Thread Robert Kern


On 11/2/12 8:57 AM, Andrew Robinson wrote:

Hi Ian,

I apologize for trying your patience with the badly written code example.  All
objects were meant to be ThirdParty(), the demo was only to show how a slice()
filter could have been applied for the reasons PEP357 made index() to exist.
eg: because numpy items passed to __getitems__ via slice syntax [::] were
illegal values.
PEP 357 is the one who specifically mentioned Numpy types -- which is the only
reason I used the name in the example;  I could have just as well used a string.

I am fully aware of what numpy does -- I have used it; modified the fortran
interfaces underneath, etc.

The index() method, however, affects *all* list objects in Python, not just
Numpy's -- correct?


Please forget that PEP 357 mentions slices at all. The motivation for the 
__index__() method (not index()) goes far beyond slices. I'm not really sure why 
they are given such a prominent place in the PEP. Let me try to lay out the 
motivation more clearly.


numpy has objects that represent integers but cannot be subclasses of the Python 
int or long objects because their internal representations are different. These 
are the width-specific types: uint8, int16, int64, etc. Before __index__() was 
introduced, all indexing operations in the builtin Python sequence types 
strictly checked for int or long objects and rejected other objects. We wanted 
to provide a generic method that third party types could implement to say, "Yes, 
I really am an integer, here is my value in a canonical representation you can 
understand." We could not use __int__() for this purpose because it has 
additional semantics, namely conversion from not-integers to integers. This is 
why floats are mentioned; they do not generally represent integers but they do 
define an __int__() method for their conversion to ints by truncation toward 
zero. Generally, they should be rejected as indices. With the __index__() 
method, we have a solution: int16 and the rest get __index__() methods and float 
doesn't.
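
A small sketch of the distinction, assuming Python 3. The Int16 class below is a hypothetical stand-in written for this example, not numpy's real int16; it only shows that a type defining __index__() is accepted by the builtin list, while a float is rejected even though int() will happily convert it.

```python
import operator


class Int16:
    """Stand-in for a width-specific integer type (NOT numpy's int16)."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # "Yes, I really am an integer; here is my canonical value."
        return self.value


data = [10, 20, 30, 40, 50]
picked = data[Int16(2)]               # plain indexing goes through __index__
sliced = data[Int16(1):Int16(4)]      # so do slice bounds
as_int = operator.index(Int16(3))     # the same protocol, called explicitly

try:
    data[2.0]                         # float defines __int__ but not __index__
    float_ok = True
except TypeError:
    float_ok = False

converted = int(2.7)                  # __int__ converts lossily, hence unsuitable

print(picked, sliced, as_int, float_ok, converted)
```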


This is used where an integer index or offset is needed, not just in slices. 
List indices, file.seek(), mmap.mmap(), etc. The change to use PyIndex_Check() 
instead of PyInt_Check() was not very difficult or extensive. Even if you were 
to change the slicing API for your other reasons, __index__() would still be needed.


--
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco


Re: Negative array indicies and slice()

2012-11-01 Thread Robert Kern


On 10/31/12 8:16 PM, Andrew Robinson wrote:

On 10/31/2012 02:20 PM, Ian Kelly wrote:

On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote:

Then; I'd note:  The non-goofy purpose of slice is to hold three data
values;  They are either numbers or None.  These *normally* encountered
values can't create a memory loop.
So, FOR AS LONG, as the object representing slice does not contain an
explicit GC pair; I move that we mandate (yes, in the current python
implementation, even as a *fix*) that its named members may not be
assigned any objects other than None or numbers

eg: Lists would be forbidden

Since functions, and subclasses, can be test evaluated by int(
the_thing_to_try ) and *[] can too,
generality need not be lost for generating nothing or numbers.


PEP 357 requires that anything implementing the __index__ special method be
allowed for slicing sequences (and also that __index__ be used for the
conversion).  For the most part, that includes ints and numpy integer types,
but other code could be doing esoteric things with it.


I missed something... (but then that's why we're still talking about it...)

Reading the PEP, it notes that *only* integers (or longs) are permitted in slice
syntax.
(Overlooking None, of course... which is strange...)

The PEP gives the only exceptions as objects with method __index__.

Automatically, then, an empty list is forbidden (in slice syntax).
However,  What you did, was circumvent the PEP by passing an empty list directly
to slice(), and avoiding running it through slice syntax processing.


Why do you think it is forbidden by the syntax?

[~]
|1 class A(object):
..     def __getitem__(self, key):
..         return key
..

[~]
|2 a = A()

[~]
|3 a[[]:]
slice([], None, None)


The PEP is a little unclear and refers to a state of the Python interpreter that 
no longer exists. At the time, I think __getslice__() was still not deprecated, 
and it did require ints (or after the PEP, __index__able objects). 
__getslice__() is now deprecated in favor of __getitem__() where you can 
interpret slice objects with arbitrary objects to your heart's content. 
Arbitrary objects *are* definitely allowed by the slice syntax (how could the 
syntax know what is an int and what is another kind of object?). Most objects 
that interpret slices, especially the builtin sequence types, do require 
__index__able objects (or None).
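
To make the last two sentences concrete, here is a short sketch assuming Python 3: the slice syntax hands arbitrary objects to __getitem__ unchanged, and it is the builtin list that refuses them afterwards. Class A mirrors the session above; the variable names are illustrative only.

```python
class A:
    def __getitem__(self, key):
        # Return whatever the slice syntax built, untouched.
        return key


a = A()
odd = a[[]:"x":2.5]    # the syntax accepts a list, a str and a float

try:
    [1, 2, 3][[]:]     # the builtin list interprets the slice and objects
    builtin_accepts = True
except TypeError:
    builtin_accepts = False

print(odd, builtin_accepts)
```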



So, what's the psychology behind allowing slice() to hold objects which are not
converted to ints/longs in the first place?


In numpy, we (ab)use this freedom for some convenient notation in special 
objects. We have a couple of grid-making convenience objects:


[~]
|5 numpy.mgrid[1.5:2.5:0.1]
array([ 1.5,  1.6,  1.7,  1.8,  1.9,  2. ,  2.1,  2.2,  2.3,  2.4])


This syntax uses the start:stop:step notation to make a float range. If we use 
an imaginary integer in the step slot, mgrid will interpret it as the number 
of items requested instead of the step.


[~]
|6 numpy.mgrid[1.5:2.5:11j]
array([ 1.5,  1.6,  1.7,  1.8,  1.9,  2. ,  2.1,  2.2,  2.3,  2.4,  2.5])

--
Robert Kern


Re: Negative array indicies and slice()

2012-11-01 Thread Ethan Furman


Andrew Robinson wrote:

  On 10/31/2012 02:20 PM, Ian Kelly wrote:

On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson  wrote:

Then; I'd note:  The non-goofy purpose of slice is to hold three
data values;  They are either numbers or None.  These *normally*
encountered values can't create a memory loop.
So, FOR AS LONG, as the object representing slice does not contain
an explicit GC pair; I move that we mandate (yes, in the current
python implementation, even as a *fix*) that its named members may
not be assigned any objects other than None or numbers

eg: Lists would be forbidden

Since functions, and subclasses, can be test evaluated by int(
the_thing_to_try ) and *[] can too,
generality need not be lost for generating nothing or numbers.



PEP 357 requires that anything implementing the __index__ special 
method be allowed for slicing sequences (and also that __index__ be 
used for the conversion).  For the most part, that includes ints and 
numpy integer types, but other code could be doing esoteric things 
with it.


I missed something... (but then that's why we're still talking about it...)

Reading the PEP, it notes that *only* integers (or longs) are permitted 
in slice syntax.


Keep in mind that PEPs represent Python /at that time/ -- as Python
moves forward, PEPs are not updated (this has gotten me a couple times).


The change would be backward-incompatible in any case, since there is 
certainly code out there that uses non-numeric slices -- one example 
has already been given in this thread.


Hmmm.

Now, I'm thinking -- The purpose of index(), specifically, is to notify 
when something which is not an integer may be used as an index;  You've 
helpfully noted that index() also *converts* those objects into numbers.


Ethan Fullman mentioned that he used the names of fields, instead of 
having to remember the _offsets_; Which means that his values _do 
convert_ to offset numbers


Furman, actually.  :)

And my values do *not* convert to indices (at least, not automatically).
My __getitem__ code looks like:

        elif isinstance(item, slice):
            sequence = []
            if isinstance(item.start, (str, unicode)) \
                    or isinstance(item.stop, (str, unicode)):
                field_names = dbf.field_names(self)
                start, stop, step = item.start, item.stop, item.step
                if start not in field_names or stop not in field_names:
                    raise MissingFieldError(
                            "Either %r or %r (or both) are not valid field names"
                            % (start, stop))
                if step is not None and not isinstance(step, (int, long)):
                    raise DbfError(
                            "step value must be an int or long, not %r"
                            % type(step))
                start = field_names.index(start)
                stop = field_names.index(stop) + 1
                item = slice(start, stop, step)
            for index in self._meta.fields[item]:
                sequence.append(self[index])
            return sequence

In other words, the slice contains the strings, and my code calculates
the offsets -- Python doesn't do it for me.
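
For readers without the dbf module at hand, a self-contained sketch of the same technique, assuming Python 3.7+ (keyword-argument order is preserved). The Record class and its attributes are hypothetical and are not the dbf API; the point is only that the container, not the string, translates field names into offsets.

```python
class Record:
    """Hypothetical record type; field order is fixed at construction."""

    def __init__(self, **fields):
        self.names = list(fields)
        self.values = list(fields.values())

    def __getitem__(self, item):
        if isinstance(item, slice):
            start, stop, step = item.start, item.stop, item.step
            if isinstance(start, str):
                start = self.names.index(start)
            if isinstance(stop, str):
                # Inclusive end, matching the "+ 1" in the code above.
                stop = self.names.index(stop) + 1
            return self.values[start:stop:step]
        if isinstance(item, str):
            return self.values[self.names.index(item)]
        return self.values[item]


rec = Record(name="Ada", city="London", year=1843, lang="en")
middle = rec["city":"year"]
every_other = rec["name":"lang":2]
print(middle, every_other)
```

The strings never grow an __index__ method; the slice object simply carries them to the container's __getitem__.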



His example was actually given in slice syntax notation [::].
Hence, his objects must have an index() method, correct?.


Nope.

~Ethan~

Re: Negative array indicies and slice()

2012-11-01 Thread Chris Angelico

On Fri, Nov 2, 2012 at 1:12 AM, Ethan Furman et...@stoneleaf.us wrote:
 In other words, the slice contains the strings, and my code calculates
 the offsets -- Python doesn't do it for me.

That's correct, but you're still translating those strings into
numeric indices. You can slice a database record based on column names
(though personally I would recommend against it - creates too much
dependence on column order, which I prefer to treat as
non-significant), but you can't, for instance, slice a dictionary by
keys:

foo = {"asdf":123,"qwer":234,"zxcv":345,"1234":456}
foo["qwer":"1234"] # What should this return?

I suppose conceptually you could slice any iterable by discarding till
you match the start, then yielding till you match the stop, then
returning (it'd function like itertools.islice but using non-numeric
indices - somehow). But it still depends on there being a dependable
order.
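
A rough sketch of that conceptual slicing, assuming Python 3. The function name slice_by_value is made up for this example; it treats the stop value as exclusive (a choice made here, not something stated above) and only gives a meaningful answer when the iterable has a dependable order.

```python
def slice_by_value(iterable, start, stop):
    """Yield items from the first one equal to start up to, but excluding, stop."""
    it = iter(iterable)
    # Discard until the start value is matched.
    for item in it:
        if item == start:
            yield item
            break
    # Then yield until the stop value is matched.
    for item in it:
        if item == stop:
            return
        yield item


keys = ["asdf", "qwer", "zxcv", "1234"]
middle = list(slice_by_value(keys, "qwer", "1234"))
missing = list(slice_by_value(keys, "nope", "1234"))
print(middle, missing)
```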

(Incidentally, isinstance(X, (str, unicode)) can become isinstance(X,
basestring) - they both inherit from that.)

ChrisA

Re: Negative array indicies and slice()

2012-11-01 Thread Ethan Furman


Chris Angelico wrote:

On Fri, Nov 2, 2012 at 1:12 AM, Ethan Furman et...@stoneleaf.us wrote:

In other words, the slice contains the strings, and my code calculates
the offsets -- Python doesn't do it for me.


That's correct, but you're still translating those strings into
numeric indices.


True, but the point is that the /slice/ contains a data type that is
neither a number, nor directly translatable into a number (that is, no
__index__ method), and my code would cease to function should that
change to slices be made.

~Ethan~

Re: Negative array indicies and slice()

2012-11-01 Thread Andrew Robinson


On 11/01/2012 07:12 AM, Ethan Furman wrote:

Andrew Robinson wrote:

  On 10/31/2012 02:20 PM, Ian Kelly wrote:

On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson  wrote:

Then; I'd note:  The non-goofy purpose of slice is to hold three
data values;  They are either numbers or None.  These *normally*
encountered values can't create a memory loop.
So, FOR AS LONG, as the object representing slice does not contain
an explicit GC pair; snip


A little review...
The premise of my statement here, is that Tim Peters closed the Bug report;

http://bugs.python.org/issue1501180
With the *reason* being that using GC was *goofy* on account of what slice() was intended 
to hold, None and a number.  So, My first attempt at bug fix was simply to take Tim 
Peters at his word... since we all assume he *isn't* a Bloody Idiot.  Hey 
isn't that a swear-word somewhere in the world?  It's not where I live, but I seem to 
recall... oh, well... whatever.

I missed something... (but then that's why we're still talking about 
it...)


Reading the PEP, it notes that *only* integers (or longs) are 
permitted in slice syntax.


Keep in mind that PEPs represent Python /at that time/ -- as Python
moves forward, PEPs are not updated (this has gotten me a couple times).
And, since I am reading them in the order written (but in 3.0) trying to 
get the whole of Python into my mind on the journey to prep for porting 
it into a tiny chip -- I'm frustrated by not being finished yet...



Furman, actually.  :)

:-!



And my values do *not* convert to indices (at least, not automatically).
Ahhh (Rhetorical & sarcastic) I was wondering how you added an index() 
method to strings, did not access it, and were still following the special PEP 
we are talking about, when you gave that example using unwrapped strings.


--

Hmmm... was that PEP the active state of Python, when Tim rejected the 
bug report?  eg: have we moved on into a place where the bug report 
ought to be re-issued since that PEP is now *effectively* passe, and Tim 
could thus be vindicated from being a b... Idiot?  (Or has he been 
given the 1st place, Python Twit award -- and his *man* the bug list 
been stripped?)



In other words, the slice contains the strings, and my code calculates
the offsets -- Python doesn't do it for me.

~Ethan~


I see, so the problem is that PEP wants you to implement the index(), 
but that is going to cause you to subclass string, and add a wrapper 
interface every time you need to index something.
eg: doing something like ---   mydbclass[ MyString( 'fromColumn' ) : 
MyString( 'toColumn' ) ] and the code becomes a candy machine interface 
issue (Chapter 5, Writing Solid Code).


My favorite line there uses no swearing:  "If they had just taken an 
extra *30* seconds thinking about their design, they could have saved 
me, and I'm sure countless others, from getting something they didn't 
want."   I laugh, if they didn't get it already -- an extra *30* seconds 
is WAY too optimistic.  Try minutes at least, with a policeman glaring 
over their shoulder.


But anyhow --- The problem lies in *when* the conversion to an integer 
is to take place, not so much if it is going to happen.  Your indexes, 
no matter how disguised, eventually will become numbers; and you have a 
way that minimizes coding cruft (The very reason I started the thread, 
actually... subclassing trivially to fix candy machine interfaces leads 
to perpetual code increases -- In cPython source-code, realloc 
wrappers and malloc wrappers are found... I've seen these wrappers 
*re*-invented in nearly every C program I've ever looked at! Talk about 
MAN-hours, wasted space, and cruft.)


So; is this a reasonable summary of salient features (status quo) ?

 * Enforcing strict numerical indexes (in the slice [::] operator)
   causes much psychological angst when attempting to write clear code
   without lots of wrapper cruft.
 * Pep 357 merely added cruft with index(), but really solved nothing. 
   Everything index() does could be implemented in __getitem__ and
   usually is.
 * slice().xxxs are merely a container for *whatever* was passed to [::]
 * slice() is
 * slice is also a full blown object, which implements a trivial method
   to dump the contents of itself to a tuple.
 * presently slice() allows memory leaks through GC loops.
 * Slice(), even though an object with a constructor, does no error
   checking to deny construction of memory leaks.

If people would take 30 seconds to think about this: the more details 
added, the more comprehensive my understanding can be -- and perhaps a 
consensus can be reached about the problem.

These are a list of relevant options, without respect to feasibility.

 * Don't bother to fix the bug; allow Python to crash with a subtle bug
   that often takes weeks to track down by the very small minority doing
   strange things (Equivalent to the monkey patch syndrome of
   D'Aprano; BTW: The longer the bug is left unfixed, the

Re: Negative array indicies and slice()

2012-11-01 Thread Chris Angelico

On Thu, Nov 1, 2012 at 10:32 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 presently slice() allows memory leaks through GC loops.

Forgive me if I've missed something here, but isn't it only possible
to make a refloop by decidedly abnormal behaviour? Stuff like:

a=[]; a.append(slice(a))

Seriously, who does this? First you have to have a reference to a
container in place of an index, and then you have to retain the slice
object inside that same container as well. Neither operation is normal
use of a slice. Where is the problem?

ChrisA

Re: Negative array indicies and slice()

2012-11-01 Thread Ian Kelly

On Thu, Nov 1, 2012 at 5:32 AM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 Hmmm... was that PEP the active state of Python, when Tim rejected the bug 
 report?

Yes. The PEP was accepted and committed in March 2006 for release in
Python 2.5.  The bug report, from June 2006, has a version
classification of Python 2.5, although 2.5 was not actually released
until September 2006.

 Pep 357 merely added cruft with index(), but really solved nothing.  
 Everything index() does could be implemented in __getitem__ and usually is.

No.  There is a significant difference between implementing this on
the container versus implementing it on the indexes.  Ethan
implemented his string-based slicing on the container, because the
behavior he wanted was specific to the container type, not the index
type.  Custom index types like numpy integers on the other hand
implement __index__ on the index type, because they apply to all
sequences, not specific containers.  This must be separate from
standard int conversion, because standard int conversion is too
general for indexing.

 slice is also a full blown object, which implements a trivial method to dump 
 the contents of itself to a tuple.

slice.indices() does not trivially dump its contents as given.  It
takes a sequence length and adjusts its indices to that length.  The C
implementation of this is around 60 lines of code.
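
A few illustrative calls, assuming Python 3, showing the adjustment slice.indices() performs for a given length. The labels and the dictionary are just scaffolding for this example.

```python
examples = {
    "negative start": slice(-3, None).indices(10),
    "stop past the end": slice(0, 100, 2).indices(5),
    "reversed": slice(None, None, -1).indices(5),
}

for label, triple in examples.items():
    # Each triple can be fed straight to range() to get concrete offsets.
    print(label, triple, list(range(*triple)))
```

Negative values are resolved against the length, out-of-range bounds are clamped, and the defaults for None depend on the sign of the step, so the result is more than a dump of the three stored values.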

 Don't bother to fix the bug; allow Python to crash with a subtle bug that 
 often take weeks to track down by the very small minority doing strange 
 things (Equivalent to the monkey patch syndrome of D'Aprano; BTW: The 
 longer the bug is left unfixed, the more people will invent uses for it )

It's been 6 years already.  AFAIK nobody has invented any uses that
are actually at risk of invoking the GC bug.

Re: Negative array indicies and slice()

2012-11-01 Thread Ethan Furman


Ian Kelly wrote:

On Thu, Nov 1, 2012 at 5:32 AM, Andrew Robinson wrote:

Don't bother to fix the bug; allow Python to crash with a subtle bug that often take weeks to track 
down by the very small minority doing strange things (Equivalent to the monkey patch 
syndrome of D'Aprano; BTW: The longer the bug is left unfixed, the more people will invent 
uses for it )


It's been 6 years already.  AFAIK nobody has invented any uses that
are actually at risk of invoking the GC bug.


The bug is not that slice allows non-numbers, but that slice objects aren't tracked by gc; I'm not 
seeing an issue with not fixing the bug.


~Ethan~

Re: Negative array indicies and slice()

2012-11-01 Thread Andrew Robinson


On 11/01/2012 12:07 PM, Ian Kelly wrote:

On Thu, Nov 1, 2012 at 5:32 AM, Andrew Robinson
andr...@r3dsolutions.com  wrote:

Hmmm... was that PEP the active state of Python, when Tim rejected the bug 
report?

Yes. The PEP was accepted and committed in March 2006 for release in
Python 2.5.  The bug report is from June 2006 has a version
classification of Python 2.5, although 2.5 was not actually released
until September 2006.

That explains Peters' remark.  Thank you.  He looks *much* smarter now.




Pep 357 merely added cruft with index(), but really solved nothing.  Everything 
index() does could be implemented in __getitem__ and usually is.

No.  There is a significant difference between implementing this on
the container versus implementing it on the indexes.  Ethan
implemented his string-based slicing on the container, because the
behavior he wanted was specific to the container type, not the index
type.  Custom index types like numpy integers on the other hand
implement __index__ on the index type, because they apply to all
sequences, not specific containers.


Hmmm...
D'Aprano didn't like the monkey patch; and sub-classing was his fix-all.

Part of my summary is based on that conversation with him, and you 
touched on one of the unfinished points; I responded to him that I 
thought __getitem__ was under-developed.   The object slice() has no 
knowledge of the size of the sequence; nor can it get that size on its 
own, but must passively wait for it to be given to it.


The bottom line is:  __getitem__ must always *PASS* len( seq ) to 
slice() each *time* the slice() object is used.  Since this is the case, 
it would have been better to have list, itself, have a default member 
which takes the raw slice indices and does the conversion itself.  The 
size would not need to be duplicated or passed -- memory savings,  
speed savings...


I'm just clay pigeoning an idea out here
Let's apply D'Aprano's logic to numpy; Numpy could just have subclassed 
*list*; so let's ignore pure python as a reason to do anything on 
behalf of Numpy:


Then, let's consider all third party classes;  These are where 
subclassing becomes a pain -- BUT: I think those could all have been 
injected.


>>> class ThirdParty( list ):  # Pretend this is someone else's...
...     def __init__(self): return
...     def __getitem__(self,aSlice): return aSlice
...

We know it will default work like this:
>>> a=ThirdParty()
>>> a[1:2]
slice(1, 2, None)

# So, here's an injection...
>>> ThirdParty.superOnlyOfNumpy__getitem__ = MyClass.__getitem__
>>> ThirdParty.__getitem__ = lambda self,aSlice: ( 1, 3, 
... self.superOnlyOfNumpy__getitem__(aSlice ).step )

>>> a[5:6]
(1, 3, None)

Numpy could have exported a (workable) function that would modify other 
list functions to affect ONLY numpy data types (eg: a filter).  This 
allows users creating their own classes to inject them with Numpy's 
filter only when they desire;


Recall Tim Peters' "explicit is better than implicit" Zen?

Most importantly normal programs not using Numpy wouldn't have had to 
carry around an extra API check for index() *every* single time the 
heavily used [::] happened.  Memory & speed both.


It's also a monkey patch, in that index() allows *conflicting* 
assumptions in violation of the unexpected monkey patch interaction worry.


eg: Numpy *CAN* release an index() function on their floats -- at which 
point a basic no touch class (list itself) will now accept float as an 
index in direct contradiction of PEP 357's comment on floats... see?


My point isn't that this particular implementation I have shown is the 
best (or even really safe, I'd have to think about that for a while).  
Go ahead and shoot it down...


My point is that, the methods found in slice(), and index() now have 
moved all the code regarding a sequence *out* of the object which has 
information on that sequence.  It smacks of legacy.


The Python parser takes values from many other syntactical constructions 
and passes them directly to their respective objects -- but in the case 
of list(), we have a complicated relationship; and not for any reason 
that can't be handled in a simpler way.


Don't consider the present API legacy for a moment, I'm asking 
hypothetical design questions:


How many users actually keep slice() around from every instance of [::] 
they use?
If it is rare, why create the slice() object in the first place and 
constantly be allocating and de-allocating memory, twice over? (once for 
the original, and once for the repetitive method which computes dynamic 
values?)  Would a single mutable have less overhead, since it is 
destroyed anyway?



Re: Negative array indicies and slice()

2012-11-01 Thread Ian Kelly

On Thu, Nov 1, 2012 at 4:25 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 The bottom line is:  __getitem__ must always *PASS* len( seq ) to slice()
 each *time* the slice() object is used.  Since this is the case, it would
 have been better to have list, itself, have a default member which takes the
 raw slice indices and does the conversion itself.  The size would not need
 to be duplicated or passed -- memory savings,  speed savings...

And then tuple would need to duplicate the same code.  As would deque.
 And str.  And numpy.array, and anything else that can be sliced,
including custom sequence classes.

 Let's apply D'Aprano 's logic to numpy; Numpy could just have subclassed
 *list*;

Numpy arrays are very different internally from lists.  They are
basically fancy wrappers of C arrays, whereas lists are a higher-level
abstraction.  They allow for multiple dimensions, which lists do not.
Slices of numpy arrays produce views, whereas slices of lists produce
brand new lists.  And they certainly do not obey the Liskov
Substitution Principle with respect to lists.

 class ThirdParty( list ):  # Pretend this is someone else's...
 ... def __init__(self): return
 ... def __getitem__(self,aSlice): return aSlice
 ...

 We know it will default work like this:
 a=ThirdParty()
 a[1:2]
 slice(1, 2, None)

 # So, here's an injection...
 ThirdParty.superOnlyOfNumpy__getitem__ = MyClass.__getitem__
 ThirdParty.__getitem__ = lambda self,aSlice: ( 1, 3,
 self.superOnlyOfNumpy__getitem__(aSlice ).step )
 a[5:6]
 (1, 3, None)

I'm not understanding what this is meant to demonstrate.  Is MyClass
a find-replace error of ThirdParty?  Why do you have __getitem__
returning slice objects instead of items or subsequences?  What does
this example have to do with numpy?

 Numpy could have exported a (workable) function that would modify other list
 functions to affect ONLY numpy data types (eg: a filter).  This allows
 users creating their own classes to inject them with Numpy's filter only
 when they desire;

 Recall Tim Peters' "explicit is better than implicit" Zen?

We could also require the user to explicitly declare when they're
performing arithmetic on variables that might not be floats.  Then we
can turn off run-time type checking unless the user explicitly
requests it, all in the name of micro-optimization and explicitness.

Seriously, whether x is usable as a sequence index is a property of x,
not a property of the sequence.  Users shouldn't need to pick and
choose *which* particular sequence index types their custom sequences
are willing to accept.  They should even be able to accept sequence
index types that haven't been written yet.
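
A minimal sketch of that point, assuming a made-up integer-like class (`Nibble` is hypothetical, not a numpy type): the index type implements `__index__` once, and every built-in sequence accepts it without being modified.

```python
import operator


class Nibble:
    """Hypothetical integer-like type; the name is invented for this sketch."""

    def __init__(self, value):
        self.value = value & 0xF

    def __index__(self):
        # Sequences that follow PEP 357 call this to obtain a real int.
        return self.value


n = Nibble(2)
print([10, 20, 30, 40][n])     # 30
print((10, 20, 30, 40)[n:])    # (30, 40)
print("abcd"[:n])              # ab
print(operator.index(n))       # 2
```

None of list, tuple or str had to know about `Nibble` in advance, which is the sense in which indexability is a property of the index object.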

 Most importantly normal programs not using Numpy wouldn't have had to carry
 around an extra API check for index() *every* single time the heavily used
 [::] happened.  Memory and speed both.

The O(1) __index__ check is probably rather inconsequential compared
to the O(n) cost of actually performing the slicing.

 It's also a monkey patch, in that index() allows *conflicting* assumptions
 in violation of the unexpected monkey patch interaction worry.

 eg: Numpy *CAN* release an index() function on their floats -- at which
 point a basic no touch class (list itself) will now accept float as an index
 in direct contradiction of PEP 357's comment on floats... see?

Such a change would only affect numpy floats, not all floats, so it
would not be a monkey-patch.  In any case, that would be incorrect
usage of __index__.  We're all consenting adults here; we don't need
supervision to protect us from buggy third-party code.
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Negative array indicies and slice()

2012-11-01 Thread 88888 Dihedral

andrew...@gmail.com wrote on Monday, October 29, 2012 at 11:12:11 AM UTC+8:
 The slice operator does not give any way (I can find!) to take slices from
 negative to positive indexes, although the range is not empty, nor the
 expected indexes out of range that I am supplying.

 Many programs that I write would require introducing variables and logical
 statements to correct the problem which is very lengthy and error prone
 unless there is a simple work around.

 I *hate* replicating code every time I need to do this!

 I also don't understand why slice() is not equivalent to an iterator, but can
 replace an integer in __getitem__() whereas xrange() can't.

 Here's an example for Linux shell, otherwise remove /bin/env...

 {{{#!/bin/env python

 a=[1,2,3,4,5,6,7,8,9,10]

 print a[-4:3]  # I am interested in getting [7,8,9,10,1,2] but I get [].

 }}}

I'll suggest to use the reverse method
to get what you want.

Of course, the reverse method is not efficient for 
a list of a huge number of objects in python.


Re: Negative array indicies and slice()

2012-11-01 Thread Steven D'Aprano

On Thu, 01 Nov 2012 15:25:51 -0700, Andrew Robinson wrote:

 On 11/01/2012 12:07 PM, Ian Kelly wrote:

 Pep 357 merely added cruft with index(), but really solved nothing. 
 Everything index() does could be implemented in __getitem__ and
 usually is.

 No.  There is a significant difference between implementing this on the
 container versus implementing it on the indexes.  Ethan implemented his
 string-based slicing on the container, because the behavior he wanted
 was specific to the container type, not the index type.  Custom index
 types like numpy integers on the other hand implement __index__ on the
 index type, because they apply to all sequences, not specific
 containers.
 
 Hmmm...
 D'Aprano didn't like the monkey patch;and sub-classing was his fix-all.

I pointed out that monkey-patching is a bad idea, even if it worked. But 
it doesn't work -- you simply cannot monkey-patch built-ins in Python. 
Regardless of whether I like the m-p or not, *you can't use it* because 
you can't patch built-in list methods.

The best you could do is subclass list, then shadow the built-in name 
list with your subclass. But that gives all sorts of problems too, in 
some ways even worse than monkey-patching.
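
A short sketch of that restriction as it shows up in CPython (the function `circular_getitem` and the class `MyList` are placeholders invented for illustration): assigning to an attribute of the built-in list type raises TypeError, while a subclass can be modified freely.

```python
def circular_getitem(self, key):
    # Placeholder replacement method; it never actually gets installed on list.
    raise NotImplementedError


try:
    list.__getitem__ = circular_getitem
    patched = True
except TypeError as exc:
    patched = False
    print("cannot patch list:", exc)


class MyList(list):
    """Subclass: attributes of a Python-level class can be replaced."""


MyList.__getitem__ = lambda self, key: "patched"
print(MyList([1, 2, 3])[0])    # patched
```

The exact wording of the TypeError message differs between Python versions, so the sketch only relies on the exception type.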

You started this thread with a question about slicing. You believe that 
one particular use-case for slicing, which involves interpreting lists as 
circular rather than linear, is the use-case that built-in list slicing 
should have supported.

Fine, you're entitled to your opinion. But that boat has sailed about 20 
years ago. Python didn't make that choice, and it won't change now. If 
you write up a PEP, you could aim to have the built-in behaviour changed 
for Python 4 in perhaps another 10-15 years or so. But for the time 
being, that's not what lists, tuples, strings, etc. do. If you want that 
behaviour, if you want a circular list, then you have to implement it 
yourself, and the easiest way to do so is with a subclass.

That's not a fix-all. I certainly don't believe that subclassing is the 
*only* way to fix this, nor that it will fix all things. But it might 
fix *some* things, such as you wanting a data type that is like a 
circular list rather than a linear list.

If you prefer to create a circular-list class from scratch, re-
implementing all the list-like behaviour, instead of inheriting from an 
existing class, then by all means go right ahead. If you have a good 
reason to spend days or weeks writing, testing, debugging and fine-tuning 
your new class, instead of about 15 minutes with a subclass, then I'm 
certainly not going to tell you not to.
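
For what it's worth, the subclass route could look roughly like the sketch below. `CircularList` is a hypothetical name, and the sketch assumes plain int bounds and a step of 1; it only changes the one case discussed in this thread (negative start, non-negative stop) and defers to list for everything else.

```python
class CircularList(list):
    """Hypothetical list subclass; only the wrap-around case is changed."""

    def __getitem__(self, key):
        if (isinstance(key, slice)
                and key.step in (None, 1)
                and key.start is not None and key.stop is not None
                and key.start < 0 <= key.stop):
            # Negative start, non-negative stop: wrap around the end.
            tail = list.__getitem__(self, slice(key.start, None))
            head = list.__getitem__(self, slice(None, key.stop))
            return CircularList(tail + head)
        # Everything else keeps the ordinary list behaviour.
        return list.__getitem__(self, key)


a = CircularList(range(1, 11))
print(a[-4:3])    # [7, 8, 9, 10, 1, 2, 3]
print(a[1:-1])    # [2, 3, 4, 5, 6, 7, 8, 9]
```

Under the usual half-open convention the stop index 3 is excluded, so the wrapped result ends with the element at index 2 (the value 3). A different convention would need a different `head` slice.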


 Part of my summary is based on that conversation with him,and you
 touched on one of the unfinished  points; I responded to him that I
 thought __getitem__ was under-developed.   The object slice() has no
 knowledge of the size of the sequence; nor can it get that size on its
 own, but must passively wait for it to be given to it.

That's because the slice object is independent of the sequence. As I 
demonstrated, you can pass a slice object to multiple sequences. This is 
a feature, not a bug.


 The bottom line is:  __getitem__ must always *PASS* len( seq ) to
 slice() each *time* the slice() object is-used.

The bottom line is: even if you are right, so what?

The slice object doesn't know what the length of the sequence is. What 
makes you think that __getitem__ passes the length to slice()? Why would 
it need to recreate a slice object that already exists?

It is the *sequence*, not the slice object, that is responsible for 
extracting the appropriate items when __getitem__ is called. __getitem__ 
gets a slice object as argument, it doesn't create one. It no more 
creates the slice object than mylist[5] creates the int 5.
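
To illustrate, here is a small sketch (the `Echo` class is invented for this example) in which one slice object is built once and handed to several sequences. The container, which knows its own length, resolves the slice by calling `slice.indices()`.

```python
class Echo:
    """Hypothetical container that reports what __getitem__ receives."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, key):
        if isinstance(key, slice):
            # The container supplies its length; the slice only stores bounds.
            return key.indices(len(self))
        return key


s = slice(1, -1)           # created once, knows nothing about any length
print([1, 2, 3, 4][s])     # [2, 3]
print("hello world"[s])    # ello worl
print(Echo(10)[s])         # (1, 9, 1)
print(Echo(10)[-4:3])      # (6, 3, 1)
```

The last line shows why `a[-4:3]` is empty on a ten-item list: the resolved start (6) is already past the stop (3).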


 Since this is the case,

But it isn't.


 it would have been better to have list, itself, have a default member
 which takes the raw slice indices and does the conversion itself.  The
 size would not need to be duplicated or passed -- memory savings, 
 speed savings...

We have already demonstrated that slice objects are smaller than (x)range 
objects and three-item tuples. In Python 3.3:

py> sys.getsizeof(range(1, 10, 2))  # xrange was renamed to range in Python 3
24
py> sys.getsizeof((1, 10, 2))
36
py> sys.getsizeof(slice(1, 10, 2))
20


It might help you to be taken seriously if you base your reasoning on 
Python as it actually is, rather than counter-factual assumptions.


 
 I'm just clay pigeoning an idea out here. Let's apply D'Aprano's
 logic to numpy; Numpy could just have subclassed *list*; 

Sure they could have, if numpy arrays were intended to be a small 
variation on Python lists. But they weren't, so they didn't.


 so let's ignore
 pure python as a reason to do anything on behalf of Numpy:
 
 Then, let's consider all third party classes;  These are where
 subclassing becomes a pain -- BUT: I think those could all have been
 injected.
 
   class ThirdParty( list ):  # Pretend this is someone else's...
 ... def

Re: Negative array indicies and slice()

2012-10-31 Thread Andrew Robinson


On 10/30/2012 10:29 PM, Michael Torrie wrote:
As this is the case, why this long discussion? If you are arguing for 
a change in Python to make it compatible with what this fork you are 
going to create will do, this has already been fairly thoroughly 
addressed early on, and reasons why the semantics will not change 
anytime soon have been given. 


I'm not arguing for a change in the present release of Python; and I 
have never done so.
Historically, if a fork happens to produce something surprisingly 
_useful_; the main code bank eventually accepts it on their own.  If a 
fork is a mistake, it dies on its own.


That really is the way things ought to be done.

   >>> import this
   The Zen of Python, by _Tim Peters_
   
   Special cases aren't special enough to break the rules.
   Although _practicality beats purity_.
   

Now, I have seen several coded projects where the idea of cyclic lists 
is PRACTICAL;
and the idea of iterating slices may be practical if they could be made 
*FASTER*.


These warrant looking into -- and carefully;  and that means making an 
experimental fork; preferably before I attempt to micro-port the python.


Regarding the continuing discussion:
The more I learn, the more informed decisions I can make regarding 
implementation.

I am almost fully understanding the questions I originally asked, now.

What remains are mostly questions about compatibility wrappers, and how 
to allow them to be used -- or selectively deleted when not necessary; 
and perhaps a demonstration or two about how slices and named tuples can 
(or can't) perform nearly the same function in slice processing.



Re: Negative array indicies and slice()

2012-10-31 Thread Ian Kelly

On Tue, Oct 30, 2012 at 4:25 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 Ian,

 Looks like it's already been wontfixed back in 2006:

 http://bugs.python.org/issue1501180

 Absolutely bloody typical, turned down because of an idiot.  Who the hell is
 Tim Peters anyway?

 I don't really disagree with him, anyway.  It is a rather obscure bug
 -- is it worth increasing the memory footprint of slice objects by 80%
 in order to fix it?

 :D

 In either event, a *bug* does exist (at *least* 20% of the time.)  Tim
 Peters could have opened the *appropriate* bug complaint if he rejected the
 inappropriate one.

Where are you getting that 20% figure from?  Reference cycles
involving slice objects would be extremely rare, certainly far less
than 20%.

 The API ought to have either 1) included the garbage collection, or 2)
 raised an exception anytime dangerous/leaky data was supplied to slice().

How would you propose detecting the latter?  At the time data is
supplied to slice() it cannot refer to the slice, as the slice does
not exist yet.  The cycle has to be created after.
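
A tiny sketch of the ordering problem: the container must exist before the slice, so it can only start referring to the slice afterwards. The variable names are illustrative only.

```python
items = []
s = slice(items)     # at construction time, `items` cannot refer to `s` yet
items.append(s)      # the cycle only appears once the slice already exists

print(s.stop is items)    # True
print(items[0] is s)      # True
```

A check performed inside `slice()` would therefore see a perfectly harmless empty list.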

 If it is worth getting rid of the 4 words of extra memory required for the
 GC -- on account of slice() refusing to support data with sub-objects; then
 I'd also point out that a very large percentage of the time, tuples also
 contain data (typically integers or floats,) which do not further
 sub-reference objects.  Hence, it would be worth it there too.

I disagree.  The proportion of the time that a tuple contains other
collection objects is *much* greater.  This happens regularly.  OTOH,
if I had to hazard a guess at the frequency with which non-atomic
objects are used in slices, it would be a fraction of a fraction of a
fraction of a percent.

 I came across some unexpected behavior in Python 3.2 when experimenting with
 ranges and replacement

 Consider, xrange is missing, BUT:

More accurately, range is gone, and xrange has been renamed range.

 a=range(1,6,2)
 a[1]
 3
 a[2]
 5
 a[1:2]
 range(3, 5, 2)

 Now, I wondered if it would still print the array or not; eg: if this was a
 __str__ issue vs. __repr__.

 print( a[1:2] ) # Boy, I have to get used to the print's parenthesis
 range(3, 5, 2)

 So, the answer is *NOPE*.

I'm not sure why you would expect it to print a list here, without an
explicit conversion.  The result of calling range in Python 3 is a
range object, not a list.
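
As a minimal sketch for Python 3.3 or later (the values are chosen only for illustration), slicing a range gives another lazy range, and an explicit list() call is what produces the items:

```python
a = range(1, 6, 2)       # lazily represents 1, 3, 5
part = a[1:2]
print(part)              # range(3, 5, 2)
print(list(part))        # [3]
```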

Re: Negative array indicies and slice()

2012-10-31 Thread Steven D'Aprano

On Tue, 30 Oct 2012 21:33:32 +, Mark Lawrence wrote:

 On 30/10/2012 18:02, Ian Kelly wrote:
 On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us
 wrote:
 File a bug report?

 Looks like it's already been wontfixed back in 2006:

 http://bugs.python.org/issue1501180


 Absolutely bloody typical, turned down because of an idiot.  Who the
 hell is Tim Peters anyway? :)

I see your smiley, but for the benefit of those who actually don't know 
who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python 
history. He invented Python's astonishingly excellent sort routine, 
Timsort, and popularised the famous adverbial phrase signoffs you will 
see in a lot of older posts.

Basically, he is in the pantheon of early Python demigods.


stop-me-before-i-start-gushing-over-the-timbot-ly y'rs,



-- 
Steven

Re: Negative array indicies and slice()

2012-10-31 Thread Mark Lawrence


On 31/10/2012 10:07, Steven D'Aprano wrote:

On Tue, 30 Oct 2012 21:33:32 +, Mark Lawrence wrote:


Absolutely bloody typical, turned down because of an idiot.  Who the
hell is Tim Peters anyway? :)


I see your smiley, but for the benefit of those who actually don't know
who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python
history. He invented Python's astonishingly excellent sort routine,
Timsort, and popularised the famous adverbial phrase signoffs you will
see in a lot of older posts.

Basically, he is in the pantheon of early Python demigods.

stop-me-before-i-start-gushing-over-the-timbot-ly y'rs,



4 / 10, must try harder, the omission of the Zen of Python is considered 
a very serious matter :)


--
Cheers.

Mark Lawrence.


Re: Negative array indicies and slice()

2012-10-31 Thread Ian Kelly

On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson
andr...@r3dsolutions.com wrote:

 Then; I'd note:  The non-goofy purpose of slice is to hold three data
 values;  They are either numbers or None.  These *normally* encountered
 values can't create a memory loop.
 So, FOR AS LONG, as the object representing slice does not contain an
 explicit GC pair; I move that we mandate (yes, in the current python
 implementation, even as a *fix*) that its named members may not be assigned
 any objects other than None or numbers

 eg: Lists would be forbidden

 Since functions, and subclasses, can be test evaluated by int(
 the_thing_to_try ) and *[] can too,
 generality need not be lost for generating nothing or numbers.


PEP 357 requires that anything implementing the __index__ special method be
allowed for slicing sequences (and also that __index__ be used for the
conversion).  For the most part, that includes ints and numpy integer
types, but other code could be doing esoteric things with it.

The change would be backward-incompatible in any case, since there is
certainly code out there that uses non-numeric slices -- one example has
already been given in this thread.
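
As a rough illustration of non-numeric slices (this is a made-up `Record` class, not the code posted earlier in the thread): slice syntax accepts arbitrary expressions, and it is the container's `__getitem__` that decides what the bounds mean. The sketch assumes Python 3.7 or later, where keyword arguments keep their order.

```python
class Record:
    """Hypothetical record type sliced by field name instead of offset."""

    def __init__(self, **fields):
        self._names = list(fields)
        self._values = list(fields.values())

    def __getitem__(self, key):
        if isinstance(key, slice):
            # The container translates the string bounds into offsets itself.
            start = 0 if key.start is None else self._names.index(key.start)
            stop = (len(self._names) if key.stop is None
                    else self._names.index(key.stop))
            return self._values[start:stop]
        return self._values[self._names.index(key)]


r = Record(name="Ada", year=1815, city="London")
print(r["name":"city"])        # ['Ada', 1815]
print(slice("name", "city"))   # slice('name', 'city', None)
```

The strings here have no `__index__` method; they never reach a built-in sequence as slice bounds, so none is needed.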

 And more wonderful yet, when I do extended slice replacement -- it gives me
 results beyond my wildest imaginings!


  a=[0,1,2,3,4,5]
  a[4:5]=range( 0, 3 ) # Size origin=1, Size dest =3
  a
 [0, 1, 2, 3, 0, 1, 2, 5]  # Insert on top of replacement
 
 But !!!NOT!!! if I do it this way:
  a[4]=range( 0, 3 )
  a
 [0, 1, 2, 3, range(0, 3), 1, 2, 5]
 


That's nothing to do with range or Python 3.  It's part of the difference
between slice assignment and index assignment.  The former unpacks an
iterable, and the latter assigns a single object.  You'd get the same
behavior with lists:

 a = list(range(6))
 a[4:5] = list(range(3))
 a
[0, 1, 2, 3, 0, 1, 2, 5]
 a = list(range(6))
 a[4] = list(range(3))
 a
[0, 1, 2, 3, [0, 1, 2], 5]

Slice assignment unpacks the list; index assignment assigns the list itself
at the index.

Re: Negative array indicies and slice()

2012-10-31 Thread Andrew Robinson


On 10/31/2012 02:20 PM, Ian Kelly wrote:

On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote:

Then; I'd note:  The non-goofy purpose of slice is to hold three
data values;  They are either numbers or None.  These *normally*
encountered values can't create a memory loop.
So, FOR AS LONG, as the object representing slice does not contain
an explicit GC pair; I move that we mandate (yes, in the current
python implementation, even as a *fix*) that its named members may
not be assigned any objects other than None or numbers

eg: Lists would be forbidden

Since functions, and subclasses, can be test evaluated by int(
the_thing_to_try ) and *[] can too,
generality need not be lost for generating nothing or numbers.


PEP 357 requires that anything implementing the __index__ special 
method be allowed for slicing sequences (and also that __index__ be 
used for the conversion).  For the most part, that includes ints and 
numpy integer types, but other code could be doing esoteric things 
with it.


I missed something... (but then that's why we're still talking about it...)

Reading the PEP, it notes that *only* integers (or longs) are permitted 
in slice syntax.

(Overlooking None, of course... which is strange...)

The PEP gives the only exceptions as objects with method __index__.

Automatically, then, an empty list is forbidden (in slice syntax).
However,  What you did, was circumvent the PEP by passing an empty list 
directly to slice(), and avoiding running it through slice syntax 
processing.


So...
Is there documentation suggesting that a slice object is meant to be 
used to hold anything other than what comes from processing a valid 
slice syntax [::]??. (we know it can be done, but that's a different Q.)



The change would be backward-incompatible in any case, since there is 
certainly code out there that uses non-numeric slices -- one example 
has already been given in this thread.

Hmmm.

Now, I'm thinking -- The purpose of index(), specifically, is to notify 
when something which is not an integer may be used as an index;  You've 
helpfully noted that index() also *converts* those objects into numbers.


Ethan Furman mentioned that he used the names of fields, instead of 
having to remember the _offsets_; Which means that his values _do 
convert_ to offset numbers


His example was actually given in slice syntax notation [::].
Hence, his objects must have an index() method, correct?.

Therefore, I still see no reason why it is permissible to assign 
non-numerical (non None) items
as an element of slice().  Or, let me re-word that more clearly -- I see 
no reason that slice named members when used as originally intended 
would ever need to be assigned a value which is not *already* converted 
to a number by index().  By definition, if it can't be coerced, it isn't 
a number.


A side note:
At 80% less overhead, and three slots -- slice is rather attractive to 
store RGB values in for a picture!  But, I don't think anyone would have 
a problem saying "No, we won't support that, even if you do do it!"


So, what's the psychology behind allowing slice() to hold objects which 
are not converted to ints/longs in the first place?



Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Mon, Oct 29, 2012 at 12:00 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 I downloaded the source code for python 3.3.0, as the tbz;
 In the directory Python-3.3.0/Python, look at Python-ast.c, line 2089 
 ff.

Python-ast.c is part of the compiler code.  That's not the struct used
to represent the object at runtime, but the struct used to represent
the AST node while compiling.

For the runtime definition of slices, look at sliceobject.h and
sliceobject.c.  Slices are represented as:

typedef struct {
PyObject_HEAD
PyObject *start, *stop, *step;  /* not NULL */
} PySliceObject;

PyObject_HEAD is a macro that incorporates the object type pointer
and the reference count.  Hence, 5 words.
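
One way to compare this with a running interpreter is sketched below. The numbers depend on the CPython version and platform, and sys.getsizeof may include a garbage-collector header on builds where slices are tracked, so treat the output as a measurement rather than a fixed value.

```python
import struct
import sys

word = struct.calcsize("P")              # size of a C pointer in bytes
size = sys.getsizeof(slice(1, 10, 2))    # what the interpreter reports
print("pointer size:", word)
print("slice size:", size, "bytes =", size / word, "words")
```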

Re: Negative array indicies and slice()

2012-10-30 Thread Andrew Robinson


Hi Ian,

There are several interesting/thoughtful things you have written.
I like the way you consider a problem before knee jerk answering.

The copying you mention (or realloc) doesn't re-copy the objects on the 
list.
It merely re-copies the pointer list to those objects. So let's see what 
it would do...


I have seen doubling as the supposed re-alloc method, but I'll assume 
1.25 --

so, 1.25**x = 20million, is 76 copies (max).

The final memory copy would leave about a 30MB hole.
And my version of Python operates initially with a 7MB virtual footprint.

Sooo If the garbage collection didn't operate at all, the copying 
would waste around:


>>> z,w = 30e6,0
>>> while (z>1): w,z = w+z, z/1.25
...
>>> print(w)
14995.8589521

eg: 150MB cumulative.
The doubles would amount to 320Megs max.
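
The geometric series behind the 1.25 estimate can be checked with a short sketch. The function name `cumulative_waste` is made up here, and the 30e6 final hole and 1.25 growth factor are simply the figures assumed above, not CPython's actual list growth policy.

```python
def cumulative_waste(final_hole, growth):
    """Sum the sizes of every abandoned block, largest first."""
    total, block = 0.0, final_hole
    while block > 1:
        total += block
        block /= growth
    return total


w = cumulative_waste(30e6, 1.25)
print(w)                             # just under 150e6
print(30e6 * 1.25 / (1.25 - 1))      # closed-form limit of the series
```

The loop sums a geometric series with ratio 1/1.25, so the total approaches final_hole * growth / (growth - 1), which is 150e6 for these figures.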

Not enough to fill virtual memory up; nor even cause a swap on a 2GB 
memory machine.

It can hold everything in memory at once.

So, I don't think Python's memory management is the heart of the problem,
although memory wise-- it does require copying around 50% of the data.

As an implementation issue, though, the large linear array may cause 
wasteful caching/swapping loops, esp, on smaller machines.


On 10/29/2012 10:27 AM, Ian Kelly wrote:

Yes, I misconstrued your question.  I thought you wanted to change the
behavior of slicing to wrap around the end when start > stop instead
of returning an empty sequence. ...  Chris has
already given ...  You
could also use map for this:

new_seq = list(map(old_seq.__getitem__, iterable))
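
Applied to the ten-item list from the start of the thread, that one-liner might be used as in this sketch (the index range is chosen here only to illustrate the wrap-around case):

```python
old_seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
indices = range(-4, 3)       # -4, -3, -2, -1, 0, 1, 2
new_seq = list(map(old_seq.__getitem__, indices))
print(new_seq)               # [7, 8, 9, 10, 1, 2, 3]
```

Negative indices count from the end and non-negative ones from the start, so iterating over the indices wraps naturally without touching slice semantics.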

MMM... interesting.

I am not against changing the behavior, but I do want solutions like you 
are offering.
As I am going to implement a python interpreter, in C,  being able to do 
things differently could significantly reduce the interpreter's size.


However, I want to break existing scripts very seldom...

I'm aware of what is possible in C with pointer arithmetic. This is 
Python, though, and Python by design has neither pointers nor pointer 
arithmetic. In any case, initializing the pointer to the end of the 
array would still not do what you want, since the positive indices 
would then extend past the end of the array. 


Yes, *and* if you have done assembly language programming -- you know 
that testing for sign is a trivial operation.  It doesn't even require a 
subtraction.  Hence, at the most basic machine level -- changing the 
base pointer *once* during a slice operation is going to be far more 
efficient than performing multiple subtractions from the end of an 
array, as the Python API defines.
I'll leave out further gory details... but it is a Python interpreter 
built in C issue.



Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 In addition to those items you mention, of which the reference count is not
 even *inside* the struct -- there is additional debugging information not
 mentioned.  Built in objects contain a line number, a column number, and
 a context pointer.  These each require a full word of storage.

 Also, built in types appear to have a kind field which indicates the
 object type but is not a pointer.  That suggests two object type
 indicators, a generic pointer (probably pointing to builtin? somewhere
 outside the struct) and a specific one (an enum) inside the C struct.

 Inside the tuple struct, I count 4 undocumented words of information.
 Over all, there is a length, the list of pointers, a kind, line, col
 and context; making 6 pieces in total.

 Although your comment says the head pointer is not required; I found in
 3.3.0 that it is a true head pointer; The Tuple() function on line 2069 of
 Python-ast.c, (3.3 version) -- is passed in a pointer called *elts.  That
 pointer is copied into the Tuple struct.

As above, you're looking at the compiler code, which is why you're
finding things like line and column.  The tuple struct is defined
in tupleobject.h and stores tuple elements in a tail array.

 How ironic,  slices don't have debugging info, that's the main reason they
 are smaller.
 When I do slice(3,0,2), surprisingly Slice() is NOT called.
 But when I do a[1:2:3] it *IS* called.

Because compiling the latter involves parsing slicing syntax, and
compiling the former does not. :-)

Re: Negative array indicies and slice()

2012-10-30 Thread Andrew Robinson


On 10/29/2012 04:01 PM, Ian Kelly wrote:

On Mon, Oct 29, 2012 at 9:20 AM, Andrew Robinson
andr...@r3dsolutions.com  wrote:

FYI: I was asking for a reason why Python's present implementation is
desirable...

I wonder, for example:

Given an arbitrary list:
a=[1,2,3,4,5,6,7,8,9,10,11,12]

Why would someone *want* to do:
a[-7:10]
Instead of saying
a[5:10] or a[-7:-2] ?

A quick search of local code turns up examples like this:

if name.startswith('{') and name.endswith('}'):
 name = name[1:-1]

Which is done to avoid explicitly calling the len() operator.

If slices worked like ranges, then the result of that would be empty,
which is obviously not desirable.
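
A minimal sketch of the contrast, using an invented sample string:

```python
name = "{title}"
if name.startswith("{") and name.endswith("}"):
    name = name[1:-1]        # -1 counts from the end, so no len() is needed
print(name)                  # title

print(list(range(1, -1)))    # [] : a range from 1 up to -1 is empty
```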
Yes, and that's an excellent point -- but note what I am showing in the 
example.  It is that example, which I am specifying.  There are only two 
cases where I think the default behavior of Python gives undesirable 
results:


The step is positive, and the pair of indexes goes from negative to 
positive.
Likewise, If the pair went from positive to negative, and the step was 
negative.


In all other combinations, the default behavior of python ought to 
remain intact.
I apologize for not making this crystal clear -- I thought you would 
focus on the specific example I gave.



I don't know of a reason why one might need to use a negative start
with a positive stop, though.
I've already given several examples; and another poster did too -- eg: 
Gene sequences for bacteria.  It's not uncommon to need this.  If I do 
some digging, I can also show some common graphics operations that 
benefit greatly from this ability -- NOTE: in another thread I just 
showed someone how to operate on RGBA values...  Slicing becomes THE 
major operation done when converting, or blitting, graphics data. etc.


Another example -- Jpeg, for example, uses discrete cosines -- which are 
a naturally cyclic data type.  They repeat with a fixed period.  I know 
there are C libraries already made for Jpeg -- but that doesn't mean 
many other applications with no C library aren't plagued by this problem.


I don't know how to make this point more clear.  There really *ARE* 
applications that use cyclic lists of data; or which can avoid extra 
logic to fix problems encountered from linear arrays which *end* at a 
particular point.


Sometimes it is desirable for a truncation to occur, sometimes it's 
NOT.  The sign convention test I outlined, I believe, clearly detects 
when a cyclic data set is desired. If there are normal examples where my 
tests fail -- that's what's important to me.




Re: Negative array indicies and slice()

2012-10-30 Thread Andrew Robinson


On 10/29/2012 10:53 PM, Michael Torrie wrote:

On 10/29/2012 01:34 PM, Andrew Robinson wrote:

No, I don't think it big and complicated.  I do think it has timing
implications which are undesirable because of how *much* slices are used.
In an embedded target -- I have to optimize; and I will have to reject
certain parts of Python to make it fit and run fast enough to be useful.

Since you can't port the full Python system to your embedded machine
anyway, why not just port a subset of python and modify it to suit your
needs right there in the C code.  It would be a fork, yes,

You're exactly right;  That's what I *know* I am faced with.


   Without a libc, an MMU on the CPU, and a kernel, it's not going to just 
compile and run.
I have libc.  The MMU is a problem; but the compiler implements the 
standard C math library; floats, though, instead of doubles.  That's 
the only problem -- there.

  What you want with slicing behavior changes has no
place in the normal cPython implementation, for a lot of reasons.  The
main one is that it is already possible to implement what you are
talking about in your own python class, which is a fine solution for a
normal computer with memory and CPU power available.
If the tests I outlined in the previous post inaccurately describe a 
major performance improvement and at least a modest code size reduction; 
or will *often* introduce bugs -- I *AGREE* with you.


Otherwise, I don't.  I don't think wasting extra CPU power is a good 
thing -- Extra CPU power can always be used by something else


I won't belabor the point further.  I'd love to see a counter example to 
the specific criteria I just provided to IAN -- it would end my quest; 
and be a good reference to point others to.




Re: Negative array indicies and slice()

2012-10-30 Thread Andrew Robinson


On 10/29/2012 11:51 PM, Ian Kelly wrote:

On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson

As above, you're looking at the compiler code, which is why you're
finding things like line and column.  The tuple struct is defined
in tupleobject.h and stores tuple elements in a tail array.



If you re-check my post to chris, I listed the struct you mention.
The C code is what is actually run (by GDB breakpoint test) when a tuple 
is instantiated.
If the tuple were stripped of the extra data -- then it ought to be as 
small as slice().
But it's not as small -- so either the sys.getsizeof() is lying -- or 
the struct you mention is not complete.


Which?

--Andrew.


Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Mon, Oct 29, 2012 at 7:49 PM, Chris Kaynor ckay...@zindagigames.com wrote:
 NOTE: The above is taken from reading the source code for Python 2.6.
 For some odd reason, I am getting that an empty tuple consists of 6
 pointer-sized objects (48 bytes on x64), rather than the expected 3
 pointer-sized (24 bytes on x64). Slices are showing up as the expected
 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1
 pointer (8 bytes on x64) per item. I imagine I am missing something,
 but cannot figure out what that would be.

I'm likewise seeing 4 extra words in tuples in 32-bit Python 3.3.
What I've found is that for tuples and other collection objects, the
garbage collector tacks on an extra header before the object in
memory.  That header looks like this:

typedef union _gc_head {
struct {
union _gc_head *gc_next;
union _gc_head *gc_prev;
Py_ssize_t gc_refs;
} gc;
long double dummy;  /* force worst-case alignment */
} PyGC_Head;

gc_next and gc_prev implement a doubly-linked list that the garbage
collector uses to explicitly track this object.  gc_refs is used for
counting references during a garbage collection and stores the GC
state of the object otherwise.

I'm not entirely certain why collection objects get this special
treatment, but there you have it.
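
For context, the usual explanation is that container objects can take part in reference cycles, which reference counting alone cannot reclaim, so the collector has to keep a list of them. The standard gc module can report which objects are tracked; a small sketch:

```python
import gc

print(gc.is_tracked(0))      # False: ints cannot hold references
print(gc.is_tracked("a"))    # False: neither can strings
print(gc.is_tracked([]))     # True: lists can take part in cycles
```

Whether a particular tuple or slice is tracked depends on the CPython version and on its contents, so the sketch sticks to the clear-cut cases.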

Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Mon, Oct 29, 2012 at 6:17 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 If you re-check my post to chris, I listed the struct you mention.
 The C code is what is actually run (by GDB breakpoint test) when a tuple is
 instantiated.

When you were running GDB, were you debugging the interactive
interpreter or a precompiled script?  The interactive interpreter does
a compilation step for every line entered.

 If the tuple were stripped of the extra data -- then it ought to be as small
 as slice().
 But it's not as small -- so either the sys.getsizeof() is lying -- or the
 struct you mention is not complete.

As just explained, the extra 16 bytes are added by the garbage collector.

Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Mon, Oct 29, 2012 at 5:54 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 I don't know of a reason why one might need to use a negative start
 with a positive stop, though.

 I've already given several examples; and another poster did too

I meant that I don't know of a reason to do that given the existing
semantics, which is what you were asking for.  :-)

I understand and agree that there are potential applications for
allowing slices to wrap around from negative to positive.  What I'm
not convinced of is that these applications need or should be handled
by the slicing operation -- which is more commonly understood as
producing subsequences -- especially since they already can be handled
relatively easily by iteration.  I think that more users would find
the behavior surprising than useful.
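
As a rough illustration of handling wrap-around by iteration instead of by slicing, something like the following would do. The name wrap_slice and its semantics (indexes taken modulo the length, stop exclusive) are assumptions made for this sketch, not an existing API.

```python
def wrap_slice(seq, start, stop):
    """Return [seq[start], ..., seq[stop - 1]], indexes taken modulo len(seq).

    Hypothetical helper; raises ZeroDivisionError for an empty seq
    when the requested range is non-empty.
    """
    n = len(seq)
    return [seq[i % n] for i in range(start, stop)]

letters = list("abcdefg")
print(wrap_slice(letters, -4, 3))   # wraps from the tail around to the head
print(letters[-4:3])                # the built-in slice is empty here
```

The built-in slice stays a plain subsequence operation, and the wrapping behavior lives in a function whose name says what it does.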

Re: Negative array indicies and slice()

2012-10-30 Thread Steven D'Aprano

By the way Andrew, the timestamps on your emails appear to be off, or 
possibly the time zone. Your posts are allegedly arriving before the 
posts you reply to, at least according to my news client.


On Mon, 29 Oct 2012 12:34:24 -0700, Andrew Robinson wrote:

 On 10/29/2012 05:02 PM, Steven D'Aprano wrote:
 On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote:

 But, why can't I just overload the existing __getitem__ for lists
 and not bother writing an entire class?
 You say that as if writing an entire class was a big complicated
 effort. It isn't. It is trivially simple, a single line:

 class MyList(list):
  ...
 No, I don't think it big and complicated.  I do think it has timing
 implications which are undesirable because of how *much* slices are
 used. In an embedded target -- I have to optimize; and I will have to
 reject certain parts of Python to make it fit and run fast enough to be
 useful.

Then I look forward to seeing your profiling results that show that the 
overhead of subclassing list is the bottleneck in your application.

Until then, you are making the classic blunder of the premature optimizer:

More computing sins are committed in the name of efficiency (without 
necessarily achieving it) than for any other single reason — including 
blind stupidity. — W.A. Wulf


I am not impressed by performance arguments when you have (apparently) 
neither identified the bottlenecks in your code, nor even measured the 
performance. You are essentially *guessing* where the bottlenecks are, 
and *hoping* that some suggested change will be an optimization rather 
than a pessimization.

Of course I may be wrong, and you have profiled your code and determined 
that the overhead of inheritance is a problem. If so, that's a different 
ball game. But your posts so far suggest to me that you're trying to 
predict performance optimizations rather than measure them.
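
A minimal sketch of the kind of measurement meant here, using the standard timeit module. MyList is the trivial subclass from the quoted example; the helper time_slicing, the slice bounds and the repeat count are arbitrary choices for illustration, and the numbers will vary by machine and interpreter.

```python
import timeit

class MyList(list):
    """Trivial subclass, as in the one-line example quoted above."""
    pass

def time_slicing(seq, number=20000):
    """Time `number` extended-slice operations on seq, in seconds."""
    return timeit.timeit(lambda: seq[10:90:3], number=number)

plain = time_slicing(list(range(100)))
sub = time_slicing(MyList(range(100)))
print("list:   %.4f s" % plain)
print("MyList: %.4f s" % sub)
```

Only a measurement like this, taken on the actual target, says whether the subclass costs anything worth worrying about.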


 You can just overload that one method in a subclass of list.  Being
 able to monkey-patch __getitem__ for the list class itself would not
 be advisable, as it would affect all list slicing anywhere in your
 program and possibly lead to some unexpected behaviors.
 That's what I am curious about.
 What unexpected behaviors would a monkey patch typically cause?
 What part of unexpected is unclear?

 Ahh -- The I don't know approach!  It's only unexpected if one is a bad
 programmer...!

No, it is unexpected because once you start relying on monkey-patching, 
and the libraries you install are monkey-patching, you have a 
combinatorial explosion of interactions. Any piece of code, anywhere, 
could monkey-patch any other piece of code -- it is a form of 
action-at-a-distance coding, like GOTOs and global variables. Use it 
with caution, in moderation.
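
For concreteness, a small sketch of the difference in CPython: assigning to an attribute of the built-in list type is refused with a TypeError, while overriding the method in a subclass affects only instances of that subclass. The names try_patch_builtin and LoudList are made up for this example.

```python
def try_patch_builtin():
    """Attempt to replace list.__getitem__; return the exception raised, if any."""
    try:
        list.__getitem__ = lambda self, index: None
    except TypeError as exc:
        return exc
    return None

class LoudList(list):
    """Hypothetical subclass: the override only affects LoudList instances."""
    def __getitem__(self, index):
        print("LoudList.__getitem__(%r)" % (index,))
        return super().__getitem__(index)

error = try_patch_builtin()
print("patching list failed with:", error)
print(LoudList("abc")[1:], [1, 2, 3][1:])
```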


 Let me see if I can illustrate a flavour of the sort of things that can
 happen if monkey-patching built-ins were allowed.
[...]
 Right, which means that people developing the libraries made
 contradictory assumptions.

Not necessarily. Not only can monkey-patches conflict, but they can 
combine in bad ways. It isn't just that Fred assumes X and Barney assumes 
not-X, but also that Fred assumes X and Barney assumes Y and *nobody* 
imagined that there was some interaction between X and Y.



[...]
 Ruby allows monkey-patching of everything. And the result was
 predictable:

 http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/


 I read that post carefully; and the author purposely notes that he is
 exaggerating.

Not carefully enough. He notes that he was using a provocative title and 
that he doesn't actually think that Ruby is being destroyed. But the 
actual harm he describes is real, e.g. bugs that take months to track 
down.


 What you are talking about is namespace preservation; 

I haven't mentioned namespaces. Nothing I have said has anything to do 
with namespaces. I remember Apple monkey-patching routines in ROM back in 
the mid 1980s, long before there was anything corresponding to namespaces 
in Apple's programming model. 


 and I am thinking
 about it. I can preserve it -- but only if I disallow true Python
 primitives in my own interpreter; I can't provide two sets in the memory
 footprint I am using.

If you want to write a language that is not Python, go right ahead.


 If someone had a clear explanation of the disadvantages of allowing an
 iterator, or a tuple -- in place of a slice() -- I would have no qualms
 dropping the subject.  However, I am not finding that yet.  I am finding
 very small optimization issues...

Python has certain public interfaces. If your language does not support 
those public interfaces, then it might be an awesome language, but it is 
not Python.

Python slices have certain features:

* they can be used repeatedly;

* they have public attributes start, step and stop;

* The stop attribute can be None, and the slice will default to the
  length of the thing being sliced, which is not known until call-time.

* Slices have a
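
The first three properties listed above can be seen directly in a short sketch (the sample sequences are arbitrary):

```python
s = slice(2, None, 3)

# The three public attributes are plain, inspectable values.
print(s.start, s.stop, s.step)

# The same slice object is reused on sequences of different lengths;
# stop=None is only resolved against the length when the slice is applied.
for seq in ("abcdefgh", [0, 1, 2, 3, 4]):
    print(seq[s], s.indices(len(seq)))
```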

Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Tue, Oct 30, 2012 at 1:21 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
 I'm not entirely certain why collection objects get this special
 treatment, but there you have it.

Thinking about it some more, this makes sense.  The GC header is there
to support garbage collection for the object.  Atomic types like ints
do not need this header because they do not reference other objects
and so cannot be involved in reference cycles.  For those types,
reference counting is sufficient.  For types like collections that do
reference other objects, garbage collection is needed.

Expanding on this, I suspect it is actually a bug that slice objects
are not tracked by the garbage collector.  The following code appears
to result in a memory leak:

import gc
gc.disable()
while True:
    for i in range(100):
        l = []
        s = slice(l)
        l.append(s)
        del s, l
    _ = gc.collect()

Try running that and watch your Python memory usage climb and climb.
For contrast, replace the slice with a list and observe that memory
usage does *not* climb.  On each iteration, the code constructs a
reference cycle between a slice and a list.  It seems that because
slices are not tracked by the garbage collector, it is unable to break
these cycles.
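
A bounded variant of the contrast case may be easier to experiment with than an endless loop. This sketch builds list-to-list cycles only, which the collector can break, and uses the return value of gc.collect() (the number of unreachable objects found) as evidence. The function name collect_after_cycles is an assumption of the sketch.

```python
import gc

def collect_after_cycles(count):
    """Build `count` list<->list reference cycles, drop them, and
    return how many unreachable objects gc.collect() reports."""
    gc.collect()            # start from a clean slate
    gc.disable()            # keep automatic collection out of the way
    try:
        for _ in range(count):
            outer = []
            inner = [outer]
            outer.append(inner)   # outer -> inner -> outer
            del outer, inner
        return gc.collect()
    finally:
        gc.enable()

found = collect_after_cycles(100)
print("unreachable objects found:", found)
```

Swapping one of the lists for a slice gives a quick way to see whether a given interpreter build finds those cycles too.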

Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote:
 File a bug report?

Looks like it's already been wontfixed back in 2006:

http://bugs.python.org/issue1501180

Re: Negative array indicies and slice()

2012-10-30 Thread Andrew Robinson


On 10/30/2012 11:02 AM, Ian Kelly wrote:

On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote:

File a bug report?

Looks like it's already been wontfixed back in 2006:

http://bugs.python.org/issue1501180
Thanks, IAN, you've answered the first of my questions and have been a 
great help.
(And yes, I was debugging interactive mode... I took a nap after writing 
that post, as I realized I had reached my 1 really bad post for the day... )


At least I finally know why Python chooses to implement slice() as a 
separate object from tuple; even if I don't like the implications.


I think there are three main consequences of the present implementation 
of slice():


1) The interpreter code size is made larger with no substantial 
improvement in functionality, which increases debugging effort.
2) No protection against perverted and surprising (are you surprised?! I 
am) memory operation exists.
3) There are memory savings associated with not having garbage collection 
overhead.


D'Aprano mentioned the named values, start, stop, step in a slice() 
which are an API and legacy issue;  These three names must also be 
stored in the interpreter someplace.  Since slice is defined at the C 
level as a struct, have you already found these names in the source code 
(hard-coded), or are they part of a .py file associated with the 
interface to the C code?



Re: Negative array indicies and slice()

2012-10-30 Thread Ethan Furman


Andrew Robinson wrote:

I can see that the slice() function can pass in arbitrary arguments.
I'm not sure for lists, which is what the range is applied to, why an 
argument like a would be part of a slice.


Well, in my dbf.py Record class you can use the names of fields as the slice arguments, instead of 
having to remember the offsets.  record['full_name':'zip4'] returns a tuple (or a list, I don't 
remember) of about 13 fields -- this is especially useful as that block of fields might not be in 
the same place in each table.
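
A toy sketch of how field names can serve as slice arguments. This is a hypothetical Record class written for illustration, not the dbf.py implementation; in particular, treating the named end field as inclusive and ignoring the step are assumptions of the sketch.

```python
class Record:
    """Hypothetical record type for illustration; not the dbf.py code."""

    def __init__(self, **fields):
        self._names = list(fields)            # field order as given
        self._values = list(fields.values())

    def _position(self, key, default):
        """Map a field name (or None, or an int) to a list position."""
        if key is None:
            return default
        if isinstance(key, str):
            return self._names.index(key)
        return key

    def __getitem__(self, item):
        if isinstance(item, slice):           # step is ignored in this sketch
            start = self._position(item.start, 0)
            stop = self._position(item.stop, len(self._values))
            if isinstance(item.stop, str):
                stop += 1                     # assumption: named end is inclusive
            return tuple(self._values[start:stop])
        return self._values[self._position(item, None)]

rec = Record(id=7, full_name="Ann Lee", city="Provo", zip4="1234", notes="")
print(rec["full_name":"zip4"])
```

This works because the slice object simply carries whatever start and stop values it is given; interpreting them is left to __getitem__.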


~Ethan~

Re: Negative array indicies and slice()

2012-10-30 Thread Mark Lawrence


On 30/10/2012 18:02, Ian Kelly wrote:

On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote:

File a bug report?


Looks like it's already been wontfixed back in 2006:

http://bugs.python.org/issue1501180



Absolutely bloody typical, turned down because of an idiot.  Who the 
hell is Tim Peters anyway? :)


--
Cheers.

Mark Lawrence.


Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence breamore...@yahoo.co.uk wrote:
 On 30/10/2012 18:02, Ian Kelly wrote:

 On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote:

 File a bug report?


 Looks like it's already been wontfixed back in 2006:

 http://bugs.python.org/issue1501180


 Absolutely bloody typical, turned down because of an idiot.  Who the hell is
 Tim Peters anyway? :)

I don't really disagree with him, anyway.  It is a rather obscure bug
-- is it worth increasing the memory footprint of slice objects by 80%
in order to fix it?

Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Tue, Oct 30, 2012 at 8:21 AM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 D'Aprano mentioned the named values, start, stop, step in a slice() which
 are an API and legacy issue;  These three names must also be stored in the
 interpreter someplace.  Since slice is defined at the C level as a struct,
 have you already found these names in the source code (hard-coded), or are
 they part of a .py file associated with the interface to the C code?

You mean the mapping of Python attribute names to C struct members?
That's in sliceobject.c:

static PyMemberDef slice_members[] = {
    {"start", T_OBJECT, offsetof(PySliceObject, start), READONLY},
    {"stop", T_OBJECT, offsetof(PySliceObject, stop), READONLY},
    {"step", T_OBJECT, offsetof(PySliceObject, step), READONLY},
    {0}
};
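
Seen from Python, the READONLY flag in that table means the three attributes can be read but not assigned. A small check, where try_assign is a helper invented for this sketch:

```python
s = slice(1, 10, 2)

def try_assign(obj, name, value):
    """Try setattr and return the exception type name, or None on success."""
    try:
        setattr(obj, name, value)
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__
    return None

results = {name: try_assign(s, name, 0) for name in ("start", "stop", "step")}
print(results)
print(s)    # unchanged
```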

Re: Negative array indicies and slice()

2012-10-30 Thread Chris Angelico

On Wed, Oct 31, 2012 at 8:47 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
 On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence breamore...@yahoo.co.uk 
 wrote:
 On 30/10/2012 18:02, Ian Kelly wrote:

 On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote:

 File a bug report?


 Looks like it's already been wontfixed back in 2006:

 http://bugs.python.org/issue1501180


 Absolutely bloody typical, turned down because of an idiot.  Who the hell is
 Tim Peters anyway? :)

 I don't really disagree with him, anyway.  It is a rather obscure bug
 -- is it worth increasing the memory footprint of slice objects by 80%
 in order to fix it?

Bug report: If I take this gun, aim it at my foot, and pull the
trigger, sometimes a hole appears in my foot.

This is hardly normal use of slice objects. And the penalty isn't a
serious one unless you're creating cycles repeatedly.

ChrisA

Re: Negative array indicies and slice()

2012-10-30 Thread Ian Kelly

On Tue, Oct 30, 2012 at 3:55 PM, Ian Kelly ian.g.ke...@gmail.com wrote:
 On Tue, Oct 30, 2012 at 8:21 AM, Andrew Robinson
 andr...@r3dsolutions.com wrote:
 D'Aprano mentioned the named values, start, stop, step in a slice() which
 are an API and legacy issue;  These three names must also be stored in the
 interpreter someplace.  Since slice is defined at the C level as a struct,
 have you already found these names in the source code (hard-coded), or are
 they part of a .py file associated with the interface to the C code?

 You mean the mapping of Python attribute names to C struct members?
 That's in sliceobject.c:

 static PyMemberDef slice_members[] = {
     {"start", T_OBJECT, offsetof(PySliceObject, start), READONLY},
     {"stop", T_OBJECT, offsetof(PySliceObject, stop), READONLY},
     {"step", T_OBJECT, offsetof(PySliceObject, step), READONLY},
     {0}
 };

Note that the slice API also includes the slice.indices method.

They also implement rich comparisons, but this appears to be done by
copying the data to tuples and comparing the tuples, which is actually
a bit ironic considering this discussion. :-)
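
A short sketch of both pieces of the API as seen from Python. The length 10 is an arbitrary example value.

```python
s = slice(-4, 3)

# indices(length) resolves negative and missing values for a given length.
start, stop, step = s.indices(10)
print((start, stop, step))                 # -4 becomes 10 - 4 = 6
print(list(range(start, stop, step)))      # empty: 6 is already past 3

# Equality compares the (start, stop, step) triples.
print(slice(1, 5) == slice(1, 5, None), slice(1, 5) == slice(1, 5, 2))
```

This is the existing behavior the thread is about: a negative start is normalized against the length before any elements are selected, so a negative start with a small positive stop selects nothing rather than wrapping around.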

Re: Negative array indicies and slice()

2012-10-30 Thread Andrew Robinson


On 10/30/2012 01:17 AM, Steven D'Aprano wrote:

By the way Andrew, the timestamps on your emails appear to be off, or
possibly the time zone. Your posts are allegedly arriving before the
posts you reply to, at least according to my news client.
:D -- yes, I know about that problem.  Every time I reboot it shows up 
again...
It's a distribution issue, my hardware clock is in local time -- but 
when the clock is read by different scripts in my distribution, some 
refuse to accept that the system clock is not UTC.
I'll be upgrading in a few weeks -- so I'm just limping along until 
then. My apology.



Then I look forward to seeing your profiling results that show that the
overhead of subclassing list is the bottleneck in your application.

Until then, you are making the classic blunder of the premature optimizer:

More computing sins are committed in the name of efficiency (without
necessarily achieving it) than for any other single reason — including
blind stupidity. — W.A. Wulf


I'm sure that's true.  Optimization, though, is a very general word.

On a highway in my neighborhood -- the government keeps trying to put 
more safety restrictions on it, because it statistically registers as 
the highest accident rate road in the *entire* region.


Naturally, the government assumes that people in my neighborhood are 
worse drivers than usual and need to be policed more -- but the truth 
is, that highway is the *ONLY* access road in the region for dozens of 
miles in any direction for a densely populated area, so if there is 
going to be an accident it will happen there; the extra safety 
precautions are not necessary when the accident rate is looked at from a 
per-capita perspective of those driving the highway.


I haven't made *the* blunder of the premature optimizer because I 
haven't implemented anything yet.  Premature optimizers don't bother to 
hold public conversation and take correction.
OTOH:  people who don't ever optimize out of fear, pay an increasing 
bloat price with time.



I am not impressed by performance arguments when you have (apparently)
neither identified the bottlenecks in your code, nor even measured the
performance.
Someone else already did a benchmark between a discrete loop and a slice 
operation.

The speeds differed by an order of magnitude.
I bench-marked a map operation, which was *much* better -- but also 
still very slow in comparison.


Let's not confound an issue here -- I am going to implement the python 
interpreter; and am not bound by optimization considerations of the 
present python interpreter -- There are things I can do which as a 
python programmer -- you can't.  I have no choice but to re-implement 
and optimize the interpreter -- the question is merely how to go about it.



  You are essentially *guessing* where the bottlenecks are,
and *hoping* that some suggested change will be an optimization rather
than a pessimization.

Of course I may be wrong, and you have profiled your code and determined
that the overhead of inheritance is a problem. If so, that's a different
ball game. But your posts so far suggest to me that you're trying to
predict performance optimizations rather than measure them.
Not really; Inheritance itself and its timing aren't my main concern.  
Even if the time was *0* that wouldn't change my mind.


There are man hours in debugging time caused by not being able to wrap 
around in a slice. (I am not ignoring the contrary man hours of an API 
change's bugs).


Human psychology is important; and it's a double edged sword.

I would refer you to a book written by Steve Maguire, Writing Solid 
Code; Chapter 5; Candy machine interfaces.


He uses the C function realloc() as an excellent example of a bad 
API; but still comments on one need that it *does* fulfill -- I've 
found it better to have one function that both shrinks and expands 
blocks so that I don't have to write *ifs* constructs every time I need 
to resize memory.  True, I give up some extra argument checking, but 
this is offset by the *ifs* that I no longer need to write (*and 
possibly mess up*).


* Extra steps that a programmer must take to achieve a task are places 
where bugs get introduced.


* API's which must be debugged to see what particular operation it is 
performing rather than knowing what that operation is from looking at 
the un-compiled code are places where bugs get introduced.


These two points are not friendly with each other -- they are in fact, 
generally in conflict.

Right, which means that people developing the libraries made
contradictory assumptions.

Not necessarily. Not only can monkey-patches conflict, but they can
combine in bad ways. It isn't just that Fred assumes X and Barney assumes
not-X, but also that Fred assumes X and Barney assumes Y and *nobody*
imagined that there was some interaction between X and Y.
They *STILL* made contradictory assumptions; each of them assumed the 
interaction mechanism would not be applied in a

Re: Negative array indicies and slice()

2012-10-30 Thread Mark Lawrence


On 30/10/2012 21:47, Ian Kelly wrote:

On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence breamore...@yahoo.co.uk wrote:

On 30/10/2012 18:02, Ian Kelly wrote:


On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote:


File a bug report?



Looks like it's already been wontfixed back in 2006:

http://bugs.python.org/issue1501180



Absolutely bloody typical, turned down because of an idiot.  Who the hell is
Tim Peters anyway? :)


I don't really disagree with him, anyway.  It is a rather obscure bug
-- is it worth increasing the memory footprint of slice objects by 80%
in order to fix it?



Thinking about it I entirely agree.  An 80% increase in memory footprint 
where the slice objects are being used with Python 3.3.0 Unicode would 
have disastrous consequences given the dire state of said Unicode, which 
is why some regular contributors here are giving up with Python and 
using Go.


Oh gosh look at the time, I'm just going for a walk so I can talk with 
the Pixies at the bottom of my garden before they go night nights.


--
Cheers.

Mark Lawrence.


Re: Negative array indicies and slice()

2012-10-30 Thread Mark Lawrence


On 30/10/2012 15:47, Andrew Robinson wrote:


I would refer you to a book written by Steve Maguire, Writing Solid
Code; Chapter 5; Candy machine interfaces.



The book that took a right hammering here 
http://accu.org/index.php?module=bookreviews&func=search&rid=467 ?


--
Cheers.

Mark Lawrence.


Re: Negative array indicies and slice()

2012-10-30 Thread Andrew Robinson


On 10/30/2012 04:48 PM, Mark Lawrence wrote:

On 30/10/2012 15:47, Andrew Robinson wrote:


I would refer you to a book written by Steve Maguire, Writing Solid
Code; Chapter 5; Candy machine interfaces.



The book that took a right hammering here 
http://accu.org/index.php?module=bookreviews&func=search&rid=467 ?




Yes, although Chapter 5 is the one where the realloc() issue is discussed.
If you have a library, see if you can check the book out -- rather than 
spend $$$ on it.
But, in good humor --  Consider the only criticism the poster mentioned 
about chapter 5's contents.


   Occasionally, he presents a code fragment with a subtle bug, such as:

   p = realloc(p,n);

   _I have to admit that I didn't spot the bug_, but then I never use
   realloc, knowing it to have pitfalls. What is the bug? If realloc
   cannot allocate the memory, it returns NULL and the assignment means
   you lose your original pointer.

   What are the pitfalls? Realloc may or may not copy the data to a
   new, larger, area of memory and return the address of that: many
   programmers forget this and end up with pointers into the old,
   deallocated, area. Even those programmers who remember will likely
   fall into the trap that Maguire shows.

   _Back to 'clever' code though, he prefers_:


His critique is a bit like the scene in Monty Python's the Life of Br..an...
Where the aliens come, and crash, and leave -- and is totally irrelevant 
to what the plot-line is in the movie.  What does this comment have to 
do with a critique???  Maguire didn't fail to notice the bug!


But the critic doesn't even notice the main *pointS* the author was 
trying to make in that chapter.
There are, to be sure, recommendations in the book that I don't agree 
with, and Maguire doesn't seem to do much unit testing.  Unit testing, 
postmortems, etc. are all topics that I studied in a formal class on 
Software Engineering, which had a wider perspective than Maguire brings 
to his book.


But that doesn't mean Maguire has nothing valuable to say!

A short Python Homage for readers of the linked Critique! :

I've had to follow GPL project style rules where the rule for a weird 
situation would be:
while (*condition) /* nothing */ ;   // and yes, this will sometimes 
generate a warning...


But, I have enough brains to take Maguire's *suggestion* to an improved 
Python conclusion.


#define PASS(x) {(void)NULL;}

while (*condition) PASS( Gas );  // There will be no warning


RE: Negative array indicies and slice()

2012-10-30 Thread Andrew Robinson


Ian,

  Looks like it's already been wontfixed back in 2006:

  http://bugs.python.org/issue1501180

Absolutely bloody typical, turned down because of an idiot.  Who the hell is
Tim Peters anyway?
  I don't really disagree with him, anyway.  It is a rather obscure bug
  -- is it worth increasing the memory footprint of slice objects by 80%
  in order to fix it?

:D

In either event, a *bug* does exist (at *least* 20% of the time.)  Tim 
Peters could have opened the *appropriate* bug complaint if he rejected 
the inappropriate one.


The API ought to have either 1) included the garbage collection, or 2) 
raised an exception anytime dangerous/leaky data was supplied to slice().


If it is worth getting rid of the 4 words of extra memory required for 
the GC -- on account of slice() refusing to support data with 
sub-objects; then I'd also point out that a very large percentage of the 
time, tuples also contain data (typically integers or floats,) which do 
not further sub-reference objects.  Hence, it would be worth it there too.


OTOH, if the GC is considered acceptable in non-sub-referenced tuples, 
GC ought to be acceptable in slice() as well.


Inconsistency is the mother of surprises; and code bloat through 
exceptions



Note that the slice API also includes the slice.indices method.

They also implement rich comparisons, but this appears to be done by
copying the data to tuples and comparing the tuples, which is actually
a bit ironic considering this discussion.

Yes, indeed!
I didn't mention the slice.indices method -- as its purpose is 
traditionally to *directly* feed the parameters of xrange or range.  ( I 
thought that might start a WAR! ). :D


http://docs.python.org/release/2.3.5/whatsnew/section-slices.html

class FakeSeq:
    ...
            return FakeSeq([self.calc_item(i) for i in range(*indices)])
        else:
            return self.calc_item(i)


And here I'm wondering why we can't just pass range into it directly... :(


I came across some unexpected behavior in Python 3.2 when experimenting 
with ranges and replacement


Consider, xrange is missing, BUT:
>>> a=range(1,5,2)
>>> a[1]
3
>>> a[2]
5
>>> a[1:2]
range(3, 5, 2)

Now, I wondered if it would still print the array or not; eg: if this 
was a __str__ issue vs. __repr__.


>>> print( a[1:2] ) # Boy, I have to get used to the print's parenthesis
range(3, 5, 2)

So, the answer is *NOPE*.
I guess I need to read the doc's all over again... it's ... well, quite 
different.
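
A small sketch of how a Python 3 range behaves under indexing and slicing, assuming a recent CPython 3. Slicing gives back another lazy range, and list() is what materializes the elements. On the versions this sketch assumes, range(1, 5, 2) holds only 1 and 3, so index 2 raises IndexError.

```python
a = range(1, 5, 2)          # lazily represents 1, 3

print(list(a))
print(a[1:2])               # slicing a range yields another range
print(list(a[1:2]))         # materialize it to see the elements

try:
    a[2]                    # only indexes 0 and 1 exist
except IndexError as exc:
    print("IndexError:", exc)
```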

--Andrew.


Re: Negative array indicies and slice()

2012-10-30 Thread Michael Torrie

On 10/30/2012 09:47 AM, Andrew Robinson wrote:
 Let's not confound an issue here -- I am going to implement the python 
 interpreter; and am not bound by optimization considerations of the 
 present python interpreter -- There are things I can do which as a 
 python programmer -- you can't.  I have no choice but to re-implement 
 and optimize the interpreter -- the question is merely how to go about it.

As this is the case, why this long discussion?  If you are arguing for a
change in Python to make it compatible with what this fork you are going
to create will do, this has already been fairly thoroughly addressed
early on, and reasons why the semantics will not change anytime soon have
been given.

Re: Negative array indicies and slice()

2012-10-29 Thread Andrew

On Sunday, October 28, 2012 9:26:01 PM UTC-7, Ian wrote:
 On Sun, Oct 28, 2012 at 10:00 PM,  Andrew wrote:
 
  Hi Ian,
 
  Well, no it really isn't equivalent.
 
  Consider a programmer who writes:
 
  xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2]
 
 
 
  That is the idea of a range; for what reason would anyone *EVER* want -4 
  to +3 to be 6:3???
 
 
 
 That is what ranges do, but your question was about slices, not ranges.

Actually, I said in the OP:

I also don't understand why slice() is not equivalent to an iterator, but can 
replace an integer in __getitem__() whereas xrange() can't.

=

Thank you for the code snippet; I don't think it likely that existing programs 
depend on nor use a negative index and a positive index expecting to take a 
small chunk in the center... hence, I would return the whole array; Or if 
someone said [-len(listX) : len(listX)+1 ] I would return the whole array twice.
That's the maximum that is possible.
If someone could show me a normal/reasonable script which *would* expect the 
other behavior, I'd like to know; compatibility is important.

=

My intended inference about the iterator vs. slice question was perhaps not 
obvious to you; Notice: an iterator is not *allowed* in __getitem__().

The slice class when passed to __getitem__()  was created to merely pass two 
numbers and a stride to __getitem__;  As far as I know slice() itself does 
*nothing* in the actual processing of the elements.  So, it's *redundant* 
functionality, and far worse, it's restrictive.

The philosophy of Python is to have exactly one way to do something when 
possible; so, why create a stand alone class that does nothing an existing 
class could already do, and do it better ?

A simple list of three values would be just as efficient as slice()!
xrange is more flexible, and can be just as efficient.

So, Have I misunderstood the operation of slice()?  I think I might have... but 
I don't know.

In 'C', where Python is written, circularly linked lists -- and arrays are both 
very efficient ways of accessing data.  Arrays can, in fact, have negative 
indexes -- perhaps contrary to what you thought.  One merely defines a variable 
to act as the base pointer to the array and initialize it to the *end* of the 
array. Nor is the size of the data elements an issue, since in Python all 
classes are accessed by pointers which are of uniform size. I routinely do this 
in C.

Consider, also, that xrange() does not actually create a list -- but merely an 
iterator generating integers which is exactly what __getitem__ works on.
So, xrange() does not need to incur a memory or noticeable time penalty.

From micro-python, it's clear that their implementation of xrange() is at the 
'C' level; which is extremely fast.
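
A quick sketch of the memory point, using Python 3's range (the successor of xrange) and sys.getsizeof(). The sizes printed are CPython implementation details and will differ between builds; the list length of 1000 is arbitrary.

```python
import sys

lazy = range(10**9)                 # Python 3 range, the successor of xrange
eager = list(range(1000))

print(len(lazy), lazy[123456789])   # length and indexing without a list
print(sys.getsizeof(lazy), sys.getsizeof(eager))
```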

Re: Negative array indicies and slice()

2012-10-29 Thread andrewr3mail

On Sunday, October 28, 2012 10:14:03 PM UTC-7, Paul Rubin wrote:
 Andrew writes:
 
  So: Why does python choose to convert them to positive indexes, and
 
  have slice operate differently than xrange 
 
 
 
 There was a thread a few years back, I think started by Bryan Olson,
 
 that made the case that slice indexing is a Python wart for further
 
 reasons than the above, and suggesting a notation like x[$-5] to denote
 
 what we now call x[-5] (i.e. $ is the length of the string).  So your
 
 example x[$-4:3] would clearly be the same thing as x[6:3] and not give
 
 any suggestion that it might wrap around.
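
The equivalence described there can be written out in today's Python with len() standing in for the proposed "$". A minimal sketch with an arbitrary ten-character string:

```python
x = "abcdefghij"
n = len(x)                   # what the proposed "$" would stand for

print(x[-5] == x[n - 5])     # a negative index counts from the end
print(repr(x[-4:3]), repr(x[n - 4:3]))   # both mean x[6:3]
```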

I'm getting very frustrated with the editor provided for this group... It keeps 
posting prematurely, and putting my email in even when I tell it not to each 
time; and there is no way to edit a post... but deleting is ok...

I think Olson makes a good point.  The len() operator is so ubiquitous that it 
would be very useful to have a shorthand like that.

I'll have to look for his thread.

I'm thinking that I might just patch my version of Python 3.x, in C, to allow 
iterators to be passed to __getitem__; I haven't ever seen someone wanting to 
use mixed sign indexes to extract a small chunk of an array in the middle; so I 
don't think my patch will break existing code.

The snippets of code given by other posters in the thread might also be used to 
make a compatibility wrapper; I'll have to study it closer; so that distributed 
code would still work on unpatched python, albeit much slower.

Re: Negative array indicies and slice()

2012-10-29 Thread Chris Rebert

On Mon, Oct 29, 2012 at 12:54 AM, Andrew andrewr3m...@gmail.com wrote:
 On Sunday, October 28, 2012 9:26:01 PM UTC-7, Ian wrote:
 On Sun, Oct 28, 2012 at 10:00 PM,  Andrew wrote:
snip
 The slice class when passed to __getitem__()  was created to merely pass two 
 numbers and a stride to __getitem__;  As far as I know slice() itself does 
 *nothing* in the actual processing of the elements.  So, it's *redundant* 
 functionality, and far worse, it's restrictive.

 The philosophy of Python is to have exactly one way to do something when 
 possible; so, why create a stand alone class that does nothing an existing 
 class could already do, and do it better ?

 A simple list of three values would be just as efficient as slice()!
 xrange is more flexible, and can be just as efficient.

 So, Have I misunderstood the operation of slice()?  I think I might have... 
 but I don't know.

`slice` is intentionally lenient about the types of the start, stop, and step:
>>> class Foo:
...     def __getitem__(self, slice_):
...         print(slice_)
...         return 42
...
>>> Foo()["a":"b":"c"]
slice('a', 'b', 'c')
42

Thus, the thing being sliced is free to interpret the parts of the
slice however it wishes; hence, slice() is unable to contain the
processing you speak of.
By contrast, xrange() limits itself to integers.
To support the more general case, the slice syntax thus produces a
`slice` rather than an `xrange`.
Doubtlessly, there are also historical issues involved. As implied by
the ugliness of its name, `xrange` was added to the language
relatively later.
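
For illustration only (not part of the original post), here is a minimal sketch of that difference. It assumes Python 3, where `range` plays the role of `xrange`; the `Probe` class is a hypothetical name.

```python
class Probe:
    """Hypothetical class that returns whatever key __getitem__ receives."""

    def __getitem__(self, key):
        return key


probe = Probe()

# The slice syntax accepts arbitrary objects; nothing is validated or converted.
received = probe["a":"b":"c"]
parts = (received.start, received.stop, received.step)

# range (xrange in Python 2) insists on integers.
try:
    range("a", "b")
    range_accepts_strings = True
except TypeError:
    range_accepts_strings = False
```

The slice object only carries the three values; deciding what they mean is left to the object being sliced.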

Cheers,
Chris

Re: Negative array indicies and slice()

2012-10-29 Thread andrewr3mail

On Sunday, October 28, 2012 9:44:56 PM UTC-7, alex23 wrote:
 On Oct 29, 2:09 pm, Andrew andrewr3m...@gmail.com wrote:
  I use this arbitrary range code *often* so I need a general purpose 
  solution.
  I looked up slice() but the help is of no use, I don't even know how I might
  overload it to embed some logic to concatenate ranges of data; nor even if
  it is possible.

 Slices are passed in if provided to __getitem__/__setitem__/
 __delitem__, so you'd need to override it at the list level:

 class RangedSlicer(list):
     def __getitem__(self, item):
         # map item.start, .stop and .step to your own semantics

 Then wrap your lists with your RangedSlicer class as needed.

Hmmm...

I began a test in an interactive shell:
>>> class RangedSlicer(list):
...     def __getitem__(self,item):
...         print item
... 
>>> a=[1,2,3,4,5]
>>> a.__getitem__( slice(1,5) )
[2, 3, 4, 5]

Very odd...  I would have expected [1,2,3,4]

>>> a.__getitem__( slice(1,8) )
[2, 3, 4, 5]

So, slice() somehow was truncated although it ought to have been executed 
first, and passed to __getitem__() before __getitem__ could affect it.
That requires some tricky programming!

Not only that, but,
a.__getitem__( xrange[1,8] )
Causes an exception before the __getitem__ shadowing received it.

I don't see how I can override it with your suggestion, which is very 
inconsistent, because your idea looks like normal Python that would work for 
user-defined classes.


Re: Negative array indicies and slice()

2012-10-29 Thread Chris Rebert

On Mon, Oct 29, 2012 at 1:08 AM,  andrewr3m...@gmail.com wrote:
 On Sunday, October 28, 2012 10:14:03 PM UTC-7, Paul Rubin wrote:
 Andrew writes:
snip
 I'm getting very frustrated with the editor provided for this group... It 
 keeps posting prematurely, and putting my email in even when I tell it not to 
 each time; and there is no way to edit a post... but deleting is ok...

This is a Usenet newsgroup[1], not a web forum. There are noteworthy
differences between the two.
FWICT, you happen to be accessing us via Google Groups, which is
widely acknowledged to suck. We are not hosted *by* Google Groups;
they just happen to carry our posts.
Personally, I'd suggest using our mailing list mirror instead:
http://mail.python.org/mailman/listinfo/python-list
Or use some other, better newsgroup provider that also carries us.

[1]: http://en.wikipedia.org/wiki/Usenet

Regards,
Chris

Re: Negative array indicies and slice()

2012-10-29 Thread Chris Rebert

On Mon, Oct 29, 2012 at 1:24 AM,  andrewr3m...@gmail.com wrote:
 On Sunday, October 28, 2012 9:44:56 PM UTC-7, alex23 wrote:
 On Oct 29, 2:09 pm, Andrew andrewr3m...@gmail.com wrote:
snip
 class RangedSlicer(list):
snip
 Then wrap your lists with your RangedSlicer class as needed.

 Hmmm...

 I began a test in an interactive shell:
 >>> class RangedSlicer(list):
 ...     def __getitem__(self,item):
 ...         print item
 …

This just defines a class; it doesn't modify in-place the normal
behavior of plain lists. You have to actually *use* the class.

 a=[1,2,3,4,5]

You never wrapped `a` in a RangedSlicer or otherwise made use of RangedSlicer!
You wanted:
a = RangedSlicer([1,2,3,4,5])

 a.__getitem__( slice(1,5) )
 [2, 3, 4, 5]

 Very odd...  I would have expected [1,2,3,4]

[2, 3, 4, 5] is the return value from `a.__getitem__( slice(1,5) )`
(or, equivalently, from `[1,2,3,4,5][1:5]`). It is not the result of
print item; that line of code is never executed since you never used
the RangedSlicer class at all.
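
For illustration only (not part of the original post), here is a Python 3 sketch of the wrapped version. The `last_item` attribute is a hypothetical addition that records what `__getitem__` received, in place of the Python 2 `print item`.

```python
class RangedSlicer(list):
    def __getitem__(self, item):
        # Record exactly what was passed in, then defer to the normal list behaviour.
        self.last_item = item
        return super().__getitem__(item)


a = RangedSlicer([1, 2, 3, 4, 5])

# The slice arrives untouched; the clamping to the list's length
# happens inside list.__getitem__, not inside slice().
result = a[1:8]
received = a.last_item
```

Here `received` is still `slice(1, 8, None)` even though `result` stops at the end of the list.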

Regards,
Chris

Re: Negative array indicies and slice()

2012-10-29 Thread andrewr3mail

On Monday, October 29, 2012 1:38:04 AM UTC-7, Chris Rebert wrote:
 On Mon, Oct 29, 2012 at 1:24 AM, 
  On Sunday, October 28, 2012 9:44:56 PM UTC-7, alex23 wrote:
  On Oct 29, 2:09 pm, Andrew  wrote:
 You never wrapped `a` in a RangedSlicer or otherwise made use of RangedSlicer!
 You wanted:
 a = RangedSlicer([1,2,3,4,5])

  a.__getitem__( slice(1,5) )
  [2, 3, 4, 5]

  Very odd...  I would have expected [1,2,3,4]

 [2, 3, 4, 5] is the return value from `a.__getitem__( slice(1,5) )`
 (or, equivalently, from `[1,2,3,4,5][1:5]`). It is not the result of
 print item; that line of code is never executed since you never used
 the RangedSlicer class at all.

 Regards,
 Chris

My apology --- I deleted that post; yet it didn't delete... I saw my mistake 
seconds after posting.

* gmail.

Note: I subscribed to the python-list, and am able to receive e-mails, but I 
don't see how to write a post for this particular thread nor subscribe to this 
particular thread...

A brief suggestion, or link to a howto would be *much* appreciated.


Re: Negative array indicies and slice()

2012-10-29 Thread Mark Lawrence


On 29/10/2012 08:59, andrewr3m...@gmail.com wrote:


Note: I subscribed to the python-list, and am able to receive e-mails, but I 
don't see how to write a post for this particular thread nor subscribe to this 
particular thread...

A brief suggestion, or link to a howto would be *much* appreciated.



Get yourself a decent email client.  I read all the Python lists that 
I'm interested in using Thunderbird on Windows via gmane.


--
Cheers.

Mark Lawrence.


Re: Negative array indicies and slice()

2012-10-29 Thread Andrew Robinson


Ok, hopefully this is better.  I love my own e-mail editor...

I can see that the slice() function can pass in arbitrary arguments.
I'm not sure why, for lists, which is what the range is applied to, an 
argument like 'a' would be part of a slice.
I *really* don't see what the advantage of a slice class is over a mere 
list in the order of start, stop, step, e.g. [ 1,4,9 ]


In a dictionary, where 'a' could be a key -- I wasn't aware that there 
was a defined order that the idea of slice could apply to.


When I look at the documentation,
http://www.python.org/doc//current/c-api/slice

The only thing that slice has which is special, is that the length 
of the sequence can be given -- and the start and stop index are either 
trimmed or an error (exception???) is thrown.


Where is the information on the more general case of slice()? :-\

I am thinking, can one use the 'super' type of access, to override -- 
within the list object itself -- the __getitem__ method, and after 
pre-processing -- call the shadowed method with the modified 
parameters?  That would allow me to use the normal a[-4:6] notation, 
without having to write a wrapper class that must be explicitly called.


I'm thinking something like,

PyListObject.__getitem__= lambda self, slice: 

--Andrew.




Re: Negative array indicies and slice()

2012-10-29 Thread Mark Lawrence


On 29/10/2012 02:31, Andrew Robinson wrote:

Ok, hopefully this is better.  I love my own e-mail editor...

I can see that the slice() function can pass in arbitrary arguments.
I'm not sure for lists, which is what the range is applied to, why an
argument like a would be part of a slice.
I *really* don't see what the advantage of a slice class is over a mere
list in the order of start, stop, step eg: [ 1,4,9 ]

In a dictionary, where a could be a key -- I wasn't aware that there
was a defined order that the idea of slice could apply to.

When I look at the documentation,
http://www.python.org/doc//current/c-api/slice

The only thing that slice has which is special, is that the length
of the sequence can be given -- and the start and stop index are either
trimmed or an error (exception???) is thrown.

Where is the information on the more general case of slice()? :-\

I am thinking, can one use the 'super' type of access, to override --
within the list object itself -- the __getitem__ method, and after
pre-processing -- call the shadowed method with the modified
parameters?  That would allow me to use the normal a[-4:6] notation,
without having to write a wrapper class that must be explicitly called.

I'm thinking something like,

PyListObject.__getitem__= lambda self, slice: 

--Andrew.



I suggest that you go back and read the tutorial about slicing.  I say 
this because we've started with negative array indicies and slice() (but 
Python arrays haven't been mentioned :), then moved onto (x)range and 
now lists, dictionaries and the C API for slices.


An alternative is to tell us precisely what you're trying to achieve. 
The odds are that there's a simple answer waiting in the wings for a 
simple question.


--
Cheers.

Mark Lawrence.


Re: Negative array indicies and slice()

2012-10-29 Thread Steven D'Aprano

On Mon, 29 Oct 2012 01:59:06 -0700, andrewr3mail wrote:

 Note: I subscribed to the python-list, and am able to receive e-mails,
 but I don't see how to write a post for this particular thread nor
 subscribe to this particular thread...

The beauty of email is that you don't have to subscribe to a thread. Once 
you subscribe to the mailing list, email is delivered into your inbox. To 
reply to it, just reply to it. To ignore it, throw it in the trash.

Gmail should have a button or three that say "Reply to email" or similar. 
You want the button that says "Reply to All" or "Reply to List". Make 
sure that the reply includes python-list@python.org as a recipient.

Delete bits of the quoted email (the lines that start with > characters) 
that are no longer relevant to the conversation. Type your reply. Double 
check that the reply is going to python-list. Then hit Send.

(P.S. when you signed up for python-list@python.org, if you selected the 
option to receive a single daily digest instead of individual emails, 
you're going to have a bad time. Do yourself a favour -- and the rest of 
us -- and change back to individual emails.)



-- 
Steven

Re: Negative array indicies and slice()

2012-10-29 Thread Steven D'Aprano

On Mon, 29 Oct 2012 00:54:29 -0700, Andrew wrote:

 Actually, I said in the OP:
 
 I also don't understand why slice() is not equivalent to an iterator,
 but can replace an integer in __getitem__() whereas xrange() can't.

Slices and iterators have different purposes and therefore have not been 
made interchangeable. Yes, there are certain similarities between a slice 
and xrange, but there are also significant differences.


 Thank you for the code snippet; I don't think it likely that existing
 programs depend on nor use a negative index and a positive index
 expecting to take a small chunk in the center... 

On the contrary. That is the most straightforward and useful idea of 
slicing, to grab a contiguous slice of items.

Why would you want to grab a slice from the end of the list, and a slice 
from the start of the list, and swap them around? Apart from simulating 
card shuffles and cuts, who does that?


 hence, I would return
 the whole array; Or if someone said [-len(listX) : len(listX)+1 ] I
 would return the whole array twice.

Well, that's one possible interpretation, but it is not the one that 
Python uses. When you create your own language, you can choose whatever 
interpretation seems most useful to you too.



 That's the maximum that is possible.
 If someone could show me a normal/reasonable script which *would* expect
 the other behavior, I'd like to know; compatibility is important.

I'm not entirely sure I understand what you are asking here. 


 My intended inferences about the iterator vs. slice question was perhaps
 not obvious to you; Notice: an iterator is not *allowed* in
 __getitem__().

Actually, you can write __getitem__ for your own classes to accept 
anything you like.

py> class Test:
...     def __getitem__(self, index):
...         return index
...
py> t = Test()
py> t["Hello world"]
'Hello world'
py> t[{'x': None}]
{'x': None}
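
For illustration only (not part of the original post): the same freedom lets a list subclass accept an iterable of indexes, which is close to what the OP asked for. This is a Python 3 sketch, and `IterIndexList` is a hypothetical name.

```python
class IterIndexList(list):
    """Hypothetical list subclass whose __getitem__ also accepts an iterable of indexes."""

    def __getitem__(self, key):
        get = super().__getitem__
        if isinstance(key, (int, slice)):
            # Ordinary integer and slice indexing is left untouched.
            return get(key)
        # Anything else is treated as an iterable of integer indexes.
        return [get(i) for i in key]


a = IterIndexList(range(1, 11))
wrapped = a[range(-4, 3)]   # the last four items followed by the first three
ordinary = a[1:3]
single = a[0]
```

Plain lists are unaffected; only objects explicitly wrapped in the subclass get the extra behaviour.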


 The slice class when passed to __getitem__()  was created to merely pass
 two numbers and a stride to __getitem__;  As far as I know slice()
 itself does *nothing* in the actual processing of the elements.  So,
 it's *redundant* functionality, and far worse, it's restrictive.

You say that as if it were a bad thing.


 The philosophy of Python is to have exactly one way to do something when
 possible; so, why create a stand alone class that does nothing an
 existing class could already do, and do it better ?

What existing class is that? It certainly isn't xrange.

Because xrange represents a concrete sequence of numbers, all three of 
start, end and stride must be concrete, known, integers:

py> xrange(4, None, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required

Whereas slices can trivially include blanks that get filled in only when 
actually used:

py> "hello world"[aslice]
'owrd'
py> "NOBODY expects the Spanish Inquisition!"[aslice]
'D xet h pns nusto!'


So, no, xrange is no substitute for slices. Not even close.
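
For illustration only (not part of the original post), here is a Python 3 sketch of how the blanks get filled in at the point of use. The name `every_other` and its values are made up for this example; `slice.indices()` is the standard method that resolves a slice against a given length.

```python
every_other = slice(1, None, 2)   # stop is left open when the slice is created

text = "hello world"

# indices(length) resolves the open end against a concrete sequence length.
resolved = every_other.indices(len(text))

picked = text[every_other]

# Only after resolution can the slice be turned into a concrete range.
same = "".join(text[i] for i in range(*resolved))
```

The same slice object can be reused on sequences of any length, which a concrete range of integers cannot do.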


 A simple list of three values would be just as efficient as slice()!

On the contrary, a simple list of three values not only could not do 
everything a slice does, but it's over twice the size!

py> import sys
py> sys.getsizeof([1, 2, 3])
44
py> sys.getsizeof(slice(1, 2, 3))
20


 xrange is more flexible, and can be just as efficient.

Less flexible, less efficient.


[snip]
 In 'C', where Python is written, 

That's a popular misapprehension. Python is written in Java, or Lisp, or 
Haskell, or CLR (dot Net), or RPython, or Ocaml, or Parrot. Each of those 
languages have, or had, at least one Python implementation. Oh, there's 
also a version written in C, or so I have heard.


-- 
Steven

Re: Negative array indicies and slice()

2012-10-29 Thread Chris Angelico

On Mon, Oct 29, 2012 at 10:19 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 In 'C', where Python is written,

 That's a popular misapprehension. Python is written in Java, or Lisp, or
 Haskell, or CLR (dot Net), or RPython, or Ocaml, or Parrot. Each of those
 languages have, or had, at least one Python implementation. Oh, there's
 also a version written in C, or so I have heard.

And that's not including the human-brain implementation, perhaps the
most important of all. Although the current port of Python to my brain
isn't quite a complete implementation, lacking a few bits that I
should probably get to at some point, but even so, it's as useful to
me as firing up IDLE.

I wonder if what the OP is looking for is not slicing, but something
more akin to map. Start with a large object and an iterator that
produces keys, and create an iterator/list of their corresponding
values. Something like:

a=[1,2,3,4,5,6,7,8,9,10]
b=[a[i] for i in xrange(-4,3)]

It's not strictly a slice operation, but it's a similar sort of thing,
and it can do the wraparound quite happily.
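
For illustration only, the same idea in Python 3 (where `range` replaces `xrange`), next to the plain slice for comparison:

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Each index is looked up on its own, so negative indexes count from the end
# and the result "wraps" from the tail of the list to its head.
wrapped = [a[i] for i in range(-4, 3)]

# The slice converts -4 to len(a) - 4 == 6 first; 6 is past the stop of 3,
# so nothing is selected.
plain = a[-4:3]
```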

ChrisA

Re: Negative array indicies and slice()

2012-10-29 Thread Andrew Robinson


On 10/29/2012 04:32 AM, Chris Angelico wrote:
I wonder if what the OP is looking for is not slicing, but something 
more akin to map. Start with a large object and an iterator that 
produces keys, and create an iterator/list of their corresponding 
values. Something like:

a=[1,2,3,4,5,6,7,8,9,10]
b=[a[i] for i in xrange(-4,3)]

It's not strictly a slice operation, but it's a similar sort of thing, 
and it can do the wraparound quite happily.

ChrisA


A list comprehension ?
That does do what I am interested in, *very* much so.  Quite a gem, Chris!

:-\
I am curious as to how quickly it constructs the result compared to a 
slice operation.


Eg:
a[1:5]
vs.
[ a[i] for i in xrange(1,5) ]

But, unless it were grossly slower -- so that if/then logic and slices 
were generally faster -- I will use it.

Thanks.

--Andrew.

Re: Negative array indicies and slice()

2012-10-29 Thread Chris Angelico

On Mon, Oct 29, 2012 at 3:52 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 I am curious as to how quickly it constructs the result compared to a slice
 operation.

 Eg:
 a[1:5]
 vs.
 [ a[i] for i in xrange(1,5) ]

For the most part, don't concern yourself with performance. Go with
functionality and readability. In the trivial case shown here, the
slice is WAY clearer, so it should definitely be the one used; in
other cases, the slice might simply be insufficient, so you go with
whatever achieves your goal. Performance will usually be good
enough, even if there's a marginally faster way.

ChrisA

Re: Negative array indicies and slice()

2012-10-29 Thread Andrew Robinson


On 10/29/2012 04:19 AM, Steven D'Aprano wrote:

On Mon, 29 Oct 2012 00:54:29 -0700, Andrew wrote:

Slices and iterators have different purposes and therefore have not been
made interchangeable. Yes, there are certain similarities between a slice
and xrange, but there are also significant differences.

Aha, now we're getting to the actual subject.


[snip]

In 'C', where Python is written,

That's a popular misapprehension. Python is written in Java, or Lisp, or
Haskell, or CLR (dot Net), or RPython, or Ocaml, or Parrot. Each of those
languages have, or had, at least one Python implementation. Oh, there's
also a version written in C, or so I have heard.

:-P
I didn't say it was only written in C,  but in C where it is 
implemented.
I will be porting Python 3.xx to a super low power embedded processor 
(MSP430), both space and speed are at a premium.
Running Python on top of Java would be a *SERIOUS* mistake.  .NET won't 
even run on this system. etc.






Thank you for the code snippet; I don't think it likely that existing
programs depend on nor use a negative index and a positive index
expecting to take a small chunk in the center...

On the contrary. That is the most straightforward and useful idea of
slicing, to grab a contiguous slice of items.
Show me an example where someone would write a slice with a negative and 
a positive index (both in the same slice);
and have that slice grab a contiguous slice in the *middle* of the list 
with orientation of lower index to greater index.
I have asked before;  It's not that I don't think it possible -- it's 
that I can't imagine a common situation.



Why would you want to grab a slice from the end of the list, and a slice
from the start of the list, and swap them around? Apart from simulating
card shuffles and cuts, who does that?
Advanced statistics programmers using lookup tables that are 
symmetrical.  Try Physicists too -- but they're notably weird.



My intended inferences about the iterator vs. slice question was perhaps
not obvious to you; Notice: an iterator is not *allowed* in
__getitem__().

Actually, you can write __getitem__ for your own classes to accept
anything you like.

Yes, I realize that.
But, why can't I just overload the existing __getitem__ for lists and 
not bother writing an entire class?
Everything in Python is supposed to be an object and one of the big 
supposed selling points is the ability to overload any object's methods.
The lists aren't special -- they're just a bunch of constant decimal 
numbers, typically given as a large tuple.




py> class Test:
...     def __getitem__(self, index):
...         return index
...

Better:
>>> class Test:
...     def __getitem__( self, *index ):
...         return index

No extra curlies required...


You say that as if it were a bad thing.


hmmm... and you as if sarcastic? :-)
It is a bad thing to have any strictly un-necessary and non-code saving 
objects where memory is restricted.



What existing class is that? It certainly isn't xrange.

Because xrange represents a concrete sequence of numbers, all three of
start, end and stride must be concrete, known, integers:



Let's up the ante.  I'll admit xrange() won't do "later fill in the 
blank" -- BUT --

xrange() is a subclass of an *existing* class called iterator.
Iterators are very general.  They can even be made random.


The philosophy of Python is to have exactly one way to do something when
possible; so, why create a stand alone class that does nothing an
existing class could already do, and do it better ?

py> xrange(4, None, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required


Hmmm..
Let's try your example exactly as shown...

>>> "hello world"[aslice]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'aslice' is not defined

WOW. Cool.
Where did the blanks *actually* get filled in?  Or HOW WILL they in your 
next post?


On the contrary, a simple list of three values not only could not do 
everything a slice does, but it's over twice the size! 
Yeah, There is a definite issue there.  But the case isn't decided by 
that number alone.

A slice is storing three integers -- and an integer's size is 12.
So, slices don't use integers.  If the type that *IS* used happens to be 
a real Python type, we may merely typecast integers to that type -- 
insert them in a tuple and by definition, they must be the same size.


 Looking at some of the online programming notes -- a slice apparently 
doesn't use an integer storage variable that is capable of arbitrary 
expansion. =-O -- and hence, won't work for very large sized lists.  
That actually explains some crashes I have noted in the past when 
working with 20 million element lists that I wanted a slice of.  I had 
*plenty* of ram on that system.
Besides: The program code to implement slice() is undoubtedly larger 
than 12 bytes of savings!

How many slices() are typically found in memory simultaneously?

Re: Negative array indicies and slice()

2012-10-29 Thread Chris Angelico

On Mon, Oct 29, 2012 at 5:01 PM, Andrew Robinson
andr...@r3dsolutions.com wrote:
  Looking at some of the online programming notes -- a slice apparently
 doesn't use an integer storage variable that is capable of arbitrary
 expansion. =-O -- and hence, won't work for very large sized lists.  That
 actually explains some crashes I have noted in the past when working with 20
 million element lists that I wanted a slice of.  I had *plenty* of ram on
 that system.

Can you provide links to these notes? I'm looking at
cpython/Include/sliceobject.h that has this comment:

/*

A slice object containing start, stop, and step data members (the
names are from range).  After much talk with Guido, it was decided to
let these be any arbitrary python type.  Py_None stands for omitted values.
*/

Also, the code for slice objects in CPython works with Py_ssize_t (a
signed quantity of the same length as size_t), which will allow at
least 2**31 for an index. I would guess that your crashes were nothing
to do with 20 million elements and slices.
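
For illustration only (not part of the original post), here is a small Python 3 sketch suggesting that oversized slice bounds are clamped rather than overflowing. The values are arbitrary, and this says nothing about what caused the crashes described above.

```python
import sys

# A slice may hold integers far larger than any possible index.
huge = slice(0, 10**30)

# Resolving it against a length of 5 clamps the stop to that length.
clamped = huge.indices(5)

# Slicing a real list with an oversized stop also just runs to the end.
tail = list(range(20))[15:10**30]

# The largest index a container can have on this build.
largest_index = sys.maxsize
```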

ChrisA

Re: Negative array indicies and slice()

2012-10-29 Thread Roy Smith

In article mailman.3009.1351516065.27098.python-l...@python.org,
 Andrew Robinson andr...@r3dsolutions.com wrote:

 Show me an example where someone would write a slice with a negative and 
 a positive index (both in the same slice);
 and have that slice grab a contiguous slice in the *middle* of the list 
 with orientation of lower index to greater index.

It's possible in bioinformatics.  Many organisms have circular 
chromosomes.  It's a single DNA molecule spliced into a loop.  There's 
an origin, but it's more a convenience thing for people to assign some 
particular base-pair to be location 0.  From the organism's point of 
view, the origin isn't anything special (but there *is* a fixed 
orientation).

It's entirely plausible for somebody to want to extract the sub-sequence 
from 100 bp (base-pairs) before the origin to 100 bp after the origin.  
If you were storing the sequence in Python string (or list), the most 
convenient way to express this would be seq[-100:100].  Likewise, if you 
wanted the *other* fragment, you would write seq[100:-100].
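
For illustration only (not part of the original post), here is what standard Python slicing does with the two expressions, using a made-up ten-base sequence and offsets of 3 instead of 100:

```python
seq = "ACGTACGTAC"   # hypothetical sequence of 10 bases

# Mixed negative/positive: -3 becomes 10 - 3 == 7, and 7 is past the stop of 3,
# so standard slicing returns an empty string instead of wrapping around.
across_origin = seq[-3:3]

# The *other* fragment, away from the origin, works as written.
other_fragment = seq[3:-3]

# The fragment spanning the origin has to be assembled by hand.
spanning = seq[-3:] + seq[:3]
```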

There is a minor confounding factor here in that biologists number 
sequences starting with 1, not 0.  At least that was the way when I was 
doing this stuff mumble years ago.  I don't know what the current 
convention is.

Re: Negative array indicies and slice()

2012-10-29 Thread Ian Kelly

On Oct 29, 2012 7:10 AM, Andrew Robinson andr...@r3dsolutions.com wrote:
 I will be porting Python 3.xx to a super low power embedded processor 
 (MSP430), both space and speed are at a premium.
 Running Python on top of Java would be a *SERIOUS* mistake.  .NET won't even 
 run on this system. etc.

If that's the case, then running Python at all is probably a mistake.
You know the interpreter alone has an overhead of nearly 6 MB?

 Yes, I realize that.
 But, why can't I just overload the existing __getitem__ for lists and not 
 bother writing an entire class?

You can just overload that one method in a subclass of list.  Being
able to monkey-patch __getitem__ for the list class itself would not
be advisable, as it would affect all list slicing anywhere in your
program and possibly lead to some unexpected behaviors.
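
For illustration only (not part of the original post), here is a Python 3 sketch of both points. `WrapList` is a hypothetical name, and the wraparound rule it implements (negative start, non-negative stop, no step) is just one possible reading of what the OP wants.

```python
# CPython refuses to monkey-patch built-in types such as list.
try:
    list.__getitem__ = lambda self, key: None
    patched = True
except TypeError:
    patched = False


class WrapList(list):
    """Hypothetical subclass: a slice like [-4:3] wraps around the end of the list."""

    def __getitem__(self, key):
        if isinstance(key, slice) and key.step is None:
            start, stop = key.start, key.stop
            if start is not None and stop is not None and start < 0 <= stop:
                # Tail of the list followed by its head.
                tail = super().__getitem__(slice(start, None))
                head = super().__getitem__(slice(None, stop))
                return tail + head
        # Everything else keeps the standard behaviour.
        return super().__getitem__(key)


w = WrapList(range(1, 11))
wrapped = w[-4:3]
ordinary = w[2:5]
builtin_result = list(range(1, 11))[-4:3]   # plain lists are unaffected
```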

 Hmmm..
 Let's try your example exactly as shown...

 >>> "hello world"[aslice]

 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 NameError: name 'aslice' is not defined

 WOW. Cool.
 Where did the blanks *actually* get filled in?  Or HOW WILL they in your next 
 post?

It appears that Steven omitted the definition of aslice by mistake.
It looks like it should have been:

aslice = slice(4, None, 2)

  Looking at some of the online programming notes -- a slice apparently 
 doesn't use an integer storage variable that is capable of arbitrary 
 expansion. =-O -- and hence, won't work for very large sized lists.  That 
 actually explains some crashes I have noted in the past when working with 20 
 million element lists that I wanted a slice of.  I had *plenty* of ram on 
 that system.

20 million is nothing.  On a 32-bit system, sys.maxsize == 2 ** 31 -
1.  If the error you were seeing was MemoryError, then more likely you
were running into dynamic allocation issues due to fragmentation of
virtual memory.

Re: Negative array indicies and slice()

2012-10-29 Thread Ian Kelly

On Mon, Oct 29, 2012 at 1:54 AM, Andrew andrewr3m...@gmail.com wrote:
 My intended inferences about the iterator vs. slice question was perhaps not 
 obvious to you; Notice: an iterator is not *allowed* in __getitem__().

Yes, I misconstrued your question.  I thought you wanted to change the
behavior of slicing to wrap around the end when start > stop instead
of returning an empty sequence.  What you actually want is a new
sequence built from indexes supplied by an iterable.  Chris has
already given you a list comprehension solution to solve that.  You
could also use map for this:

new_seq = list(map(old_seq.__getitem__, iterable))

Since you seem to be concerned about performance, I'm not sure in this
case whether the map or the list comprehension will be faster.  I'll
leave you to test that on your intended hardware.
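
For illustration only (not part of the original post), here are the two approaches side by side in Python 3, plus `operator.itemgetter` as a third option that the thread does not mention. The sequence and indexes are made up.

```python
from operator import itemgetter

old_seq = list("abcdefghij")
indexes = range(-2, 3)          # any iterable of integer indexes

# map calls the bound __getitem__ once per index.
via_map = list(map(old_seq.__getitem__, indexes))

# The equivalent list comprehension.
via_comprehension = [old_seq[i] for i in indexes]

# itemgetter with several arguments returns a tuple of the selected items.
via_itemgetter = list(itemgetter(*indexes)(old_seq))
```

All three build the same new list; which is fastest would need measuring on the target hardware.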

 In 'C', where Python is written, circularly linked lists -- and arrays are 
 both very efficient ways of accessing data.  Arrays can, in fact, have 
 negative indexes -- perhaps contrary to what you thought.  One merely defines 
 a variable to act as the base pointer to the array and initialize it to the 
 *end* of the array. Nor is the size of the data elements an issue, since in 
 Python all classes are accessed by pointers which are of uniform size. I 
 routinely do this in C.

I'm aware of what is possible in C with pointer arithmetic.  This is
Python, though, and Python by design has neither pointers nor pointer
arithmetic.  In any case, initializing the pointer to the end of the
array would still not do what you want, since the positive indices
would then extend past the end of the array.

Re: Negative array indicies and slice()

2012-10-29 Thread Steven D'Aprano

On Mon, 29 Oct 2012 23:40:53 +1100, Chris Angelico wrote:

 On Mon, Oct 29, 2012 at 3:52 PM, Andrew Robinson
 andr...@r3dsolutions.com wrote:
 I am curious as to how quickly it constructs the result compared to a
 slice operation.

 Eg:
 a[1:5]
 vs.
 [ a[i] for i in xrange(1,5) ]
 
 For the most part, don't concern yourself with performance. Go with
 functionality and readability. In the trivial case shown here, the slice
 is WAY clearer, so it should definitely be the one used; in other cases,
 the slice might simply be insufficient, so you go with whatever achieves
 your goal. Performance will usually be good enough, even if there's a
 marginally faster way.


Slicing is about an order of magnitude faster:


[steve@ando ~]$ python2.7 -m timeit -s "x = range(100, 1000, 2)" "x[20:40]"
100 loops, best of 3: 0.342 usec per loop
[steve@ando ~]$ python2.7 -m timeit -s "x = range(100, 1000, 2)" "[x[i] for i in xrange(20, 40)]"
10 loops, best of 3: 3.43 usec per loop
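
For illustration only (not part of the original post), a comparable measurement can be sketched from inside Python 3 with the `timeit` module. The repetition count of 100,000 is an arbitrary choice, and the timings will differ from machine to machine.

```python
import timeit

# In Python 3, range() is lazy, so build a real list for the comparison.
setup = "x = list(range(100, 1000, 2))"

slice_time = timeit.timeit("x[20:40]", setup=setup, number=100_000)
loop_time = timeit.timeit("[x[i] for i in range(20, 40)]", setup=setup, number=100_000)

# How many times slower the comprehension was on this run.
ratio = loop_time / slice_time
print("slice: %.4fs  comprehension: %.4fs  ratio: %.1fx" % (slice_time, loop_time, ratio))
```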


-- 
Steven

Re: Negative array indicies and slice()

2012-10-29 Thread Steven D'Aprano

On Mon, 29 Oct 2012 11:19:38 +, Steven D'Aprano wrote:

 Because xrange represents a concrete sequence of numbers, all three of
 start, end and stride must be concrete, known, integers:
 
 py> xrange(4, None, 2)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: an integer is required
 
 Whereas slices can trivially include blanks that get filled in only when
 actually used:
 
 py> "hello world"[aslice]
 'owrd'
 py> "NOBODY expects the Spanish Inquisition!"[aslice] 
 'D xet h pns nusto!'

/me facepalms/


Argggh, I forgot to copy-and-paste the critical line defining aslice:

aslice = slice(4, None, 2)


Sorry about that.



-- 
Steven

Re: Negative array indicies and slice()

2012-10-29 Thread Andrew Robinson


On 10/29/2012 06:52 AM, Roy Smith wrote:

Show me an example where someone would write a slice with a negative and
a positive index (both in the same slice);
and have that slice grab a contiguous slice in the *middle* of the list
with orientation of lower index to greater index.
It's possible in bioinformatics. ...
eq[100:-100].
I decided to go to bed... I was starting to write very badly worded 
responses. :)


Thanks, Roy,  what you have just shown is another example that agrees 
with what I am trying to do.
FYI: I was asking for a reason why Python's present implementation is 
desirable...


I wonder, for example:

Given an arbitrary list:
a=[1,2,3,4,5,6,7,8,9,10,11,12]

Why would someone *want* to do:
a[-7:10]
Instead of saying
a[5:10] or a[-7:-2] ?

eg:
What algorithm would naturally *desire* the default behavior of slicing 
when using *mixed* negative and positive indexes?
In the case of a bacterial circular DNA/RNA ring, asking for codons[ 
-10: 10 ]  would logically desire codons[-10:] + codons[:10], not an empty 
list, right?


I think your example is a very reasonable thing the scientific community 
would want to do with Python.

:)


Re: Negative array indicies and slice()

2012-10-29 Thread Andrew Robinson


On 10/29/2012 10:09 AM, Ian Kelly wrote:

On Oct 29, 2012 7:10 AM, Andrew Robinson andr...@r3dsolutions.com wrote:

I will be porting Python 3.xx to a super low power embedded processor (MSP430), 
both space and speed are at a premium.
Running Python on top of Java would be a *SERIOUS* mistake.  .NET won't even 
run on this system. etc.

If that's the case, then running Python at all is probably a mistake.
You know the interpreter alone has an overhead of nearly 6 MB?
There's already a version of the python interpreter which fits in under 
100K:

http://code.google.com/p/python-on-a-chip/
It's not the 3.x series, though; and I don't want to redo this once 2.7 
really does become obsolete.

Yes, I realize that.
But, why can't I just overload the existing __getitem__ for lists and not 
bother writing an entire class?

You can just overload that one method in a subclass of list.  Being
able to monkey-patch __getitem__ for the list class itself would not
be advisable, as it would affect all list slicing anywhere in your
program and possibly lead to some unexpected behaviors.

That's what I am curious about.
What unexpected behaviors would a monkey patch typically cause?
If no one really uses negative and positive indexes in the same slice 
operation, because there is no reason to do so...

It will only break the occasional esoteric application.



20 million is nothing.  On a 32-bit system, sys.maxsize == 2 ** 31 -
1.  If the error you were seeing was MemoryError, then more likely you
were running into dynamic allocation issues due to fragmentation of
virtual memory.




No, there was no error at all.  Python just crashed and exited; not even an 
exception that I can recall.   It was as if it exited normally!


The list was generated in a single pass by many .append() 's, and then 
copied once -- the original was left in place; and then I attempted to 
slice it.


I am routinely able to copy, slice, cut, append to, and delete from 
5-million-element lists without this ever happening.
If fragmentation were the issue, I'd think the shorter lists would cause 
the problem after many manipulations...


It may not be a bug in python itself, though, of course.  There are 
libraries it uses which might have a bug.



Re: Negative array indicies and slice()

2012-10-29 Thread Chris Angelico

On Tue, Oct 30, 2012 at 2:42 AM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 No, there was no error at all.  Python just crashed and exited; not even an
 exception that I can recall.   It was as if it exited normally!

Can you create a reproducible test case? There's usually a cause to
these sorts of things.

ChrisA

Re: Negative array indicies and slice()

2012-10-29 Thread Ian Kelly

On Mon, Oct 29, 2012 at 9:20 AM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 FYI: I was asking for a reason why Python's present implementation is
 desirable...

 I wonder, for example:

 Given an arbitrary list:
 a=[1,2,3,4,5,6,7,8,9,10,11,12]

 Why would someone *want* to do:
 a[-7:10]
 Instead of saying
 a[5:10] or a[-7:-2] ?

A quick search of local code turns up examples like this:

if name.startswith('{') and name.endswith('}'):
    name = name[1:-1]

If slices worked like ranges, then the result of that would be empty,
which is obviously not desirable.

I don't know of a reason why one might need to use a negative start
with a positive stop, though.

Re: Negative array indicies and slice()

2012-10-29 Thread Ian Kelly

On Mon, Oct 29, 2012 at 9:42 AM, Andrew Robinson
andr...@r3dsolutions.com wrote:
 The list was generated in a single pass by many .append() 's, and then
 copied once -- the original was left in place; and then I attempted to slice
 it.

Note that if the list was generated by .appends, then it was copied
more than once.  Python reserves a specific amount of space for the
list.  When it grows past that, the list must be reallocated and
copied.  It grows the list exponentially in order to keep the
amortized time complexity of append at O(1), but the point is that a
list of 20 million items is going to be resized and copied several
times before it is complete.

Re: Negative array indicies and slice()

2012-10-29 Thread Roy Smith

In article mailman.3056.1351552107.27098.python-l...@python.org,
 Ian Kelly ian.g.ke...@gmail.com wrote:

 On Mon, Oct 29, 2012 at 9:42 AM, Andrew Robinson
 andr...@r3dsolutions.com wrote:
  The list was generated in a single pass by many .append() 's, and then
  copied once -- the original was left in place; and then I attempted to slice
  it.
 
 Note that if the list was generated by .appends, then it was copied
 more than once.  Python reserves a specific amount of space for the
 list.  When it grows past that, the list must be reallocated and
 copied.  It grows the list exponentially in order to keep the
 amortized time complexity of append at O(1), but the point is that a
 list of 20 million items is going to be resized and copied several
 times before it is complete.

I think you're missing the point of amortized constant time.  Yes, the 
first item appended to the list will be copied lg(20,000,000) ~= 25 
times, because the list will be resized that many times(*).  But, on 
average (I'm not sure if average is strictly the correct word here), 
each item will be copied only once.

Still, it always struck me as odd that there's no preallocate() method.  
There are times when you really do know how many items you're going to 
add to the list, and doing a single allocation would be a win.  And it 
doesn't cost anything to provide it.  I suppose, however, if you're 
adding enough items that preallocating would really matter, then maybe 
you want an array instead.

(*) I don't know the exact implementation; I'm assuming each resize is a 
factor of 2.

Re: Negative array indicies and slice()

2012-10-29 Thread Ian Kelly

On Mon, Oct 29, 2012 at 5:24 PM, Roy Smith r...@panix.com wrote:
 I think you're missing the point of amortized constant time.  Yes, the
 first item appended to the list will be copied lg(20,000,000) ~= 25
 times, because the list will be resized that many times(*).  But, on
 average (I'm not sure if average is strictly the correct word here),
 each item will be copied only once.

 Still, it always struck me as odd that there's no preallocate() method.
 There are times when you really do know how many items you're going to
 add to the list, and doing a single allocation would be a win.  And it
 doesn't cost anything to provide it.  I suppose, however, if you're
 adding enough items that preallocating would really matter, then maybe
 you want an array instead.

 (*) I don't know the exact implementation; I'm assuming each resize is a
 factor of 2.

The growth factor is approximately 1.125.  Approximately because
there is also a small constant term.  The average number of copies per
item converges on 8.
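
The figure of 8 can be checked with a small simulation. This is a sketch
only: it assumes the over-allocation rule as I understand it from CPython's
Objects/listobject.c around the 2.7/3.3 releases (newsize plus newsize/8
plus a small constant), and it counts the worst case in which every resize
really moves every existing item. The function name is hypothetical.

```python
def simulate_appends(n):
    """Return (resizes, worst_case_copies) for n appends to an empty list."""
    allocated = 0   # slots currently reserved
    size = 0        # items currently stored
    resizes = 0
    copies = 0
    while size < n:
        if size < allocated:
            # Appends that fit in the reserved space copy nothing.
            size = min(allocated, n)
            continue
        # The list is full, so the next append forces a resize.
        newsize = size + 1
        copies += size  # worst case: realloc() moves every existing item
        allocated = newsize + (newsize >> 3) + (3 if newsize < 9 else 6)
        resizes += 1
        size = newsize
    return resizes, copies

resizes, copies = simulate_appends(20000000)
print("resizes:", resizes)
print("worst-case copies per item: %.2f" % (copies / 20000000.0))
```

With a growth factor g the worst-case copies per item settle between
1/(g-1) and g/(g-1), which for g = 1.125 is between 8 and 9 depending on
how recently the list was last resized.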

Re: Negative array indicies and slice()

2012-10-29 Thread Oscar Benjamin

On 29 October 2012 23:01, Ian Kelly ian.g.ke...@gmail.com wrote:
 On Mon, Oct 29, 2012 at 9:20 AM, Andrew Robinson
 andr...@r3dsolutions.com wrote:
 FYI: I was asking for a reason why Python's present implementation is
 desirable...

 I wonder, for example:

 Given an arbitrary list:
 a=[1,2,3,4,5,6,7,8,9,10,11,12]

 Why would someone *want* to do:
 a[-7:10]
 Instead of saying
 a[5:10] or a[-7:-2] ?

 A quick search of local code turns up examples like this:

 if name.startswith('{') and name.endswith('}'):
     name = name[1:-1]

 If slices worked like ranges, then the result of that would be empty,
 which is obviously not desirable.

 I don't know of a reason why one might need to use a negative start
 with a positive stop, though.

It's useful when getting a reversed slice:

>>> a = [1,2,3,4,5,6,7,8,9,10]
>>> a[-3:3:-1]
[8, 7, 6, 5]


Oscar

Re: Negative array indicies and slice()

2012-10-29 Thread Ian Kelly

On Mon, Oct 29, 2012 at 5:43 PM, Ian Kelly ian.g.ke...@gmail.com wrote:
 The growth factor is approximately 1.125.  Approximately because
 there is also a small constant term.  The average number of copies per
 item converges on 8.

Of course, that is the *maximum* number of copies.  The actual number
could be much less if realloc() performs well.

Re: Negative array indicies and slice()

2012-10-29 Thread Steven D'Aprano

On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote:

 But, why can't I just overload the existing __getitem__ for lists and
 not bother writing an entire class?

You say that as if writing an entire class was a big complicated 
effort. It isn't. It is trivially simple, a single line:

class MyList(list):
    ...


plus the __getitem__ definition, which you would have to write anyway. It 
is a trivial amount of extra effort.
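
To give a sense of scale, here is a minimal sketch of such a subclass,
assuming Python 3 (in Python 2, simple slices of a list subclass go
through __getslice__ instead). The class name WrapList and the exact
wrap-around rule are illustrative assumptions, not anything proposed in
this thread: a slice with a negative start and a non-negative stop is
treated as tail plus head, and everything else behaves like a plain list.

```python
class WrapList(list):
    """list subclass whose mixed-sign slices wrap around (hypothetical)."""

    def __getitem__(self, index):
        if (isinstance(index, slice)
                and index.step in (None, 1)
                and index.start is not None and index.start < 0
                and index.stop is not None and index.stop >= 0):
            # Negative start with non-negative stop: join tail and head.
            tail = list.__getitem__(self, slice(index.start, None))
            head = list.__getitem__(self, slice(None, index.stop))
            return tail + head
        # Integers and every other slice keep the normal behaviour.
        return list.__getitem__(self, index)

a = WrapList([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(a[-4:3])
print(a[1:-1])
```

Only objects created as WrapList are affected; every plain list elsewhere
in the program keeps the standard slicing rules.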


 You can just overload that one method in a subclass of list.  Being
 able to monkey-patch __getitem__ for the list class itself would not be
 advisable, as it would affect all list slicing anywhere in your program
 and possibly lead to some unexpected behaviors.

 That's what I am curious about.
 What unexpected behaviors would a monkey patch typically cause? 

What part of "unexpected" is unclear?

Monkey-patching is poor programming technique because it leads to 
*unexpected* and *impossible to predict* interactions between *distant* 
parts of the code. It leads to impossible to predict differences between 
the source code on disk and the actual running code. It leads to 
impossible to predict differences between documented behaviour and actual 
behaviour.

Let me see if I can illustrate a flavour of the sort of things that can 
happen if monkey-patching built-ins were allowed.

You create a list and print it:

# simulated output
py> x = [5, 2, 4, 1]
py> print(x)
[1, 2, 4, 5]

What? How did that happen? That's not the list you provided. The order 
has been lost.

So you dig deep into your code, and you don't find anything. And you read 
the Python documentation for lists, and don't find anything. And you 
google the Internet, and don't find anything. And you ask for help, and 
everybody says you're crazy because when they duplicate your code they 
get the expected behaviour. And you report a bug in Python, and it gets 
closed as "cannot replicate".

Finally you search deep into the libraries used in your code, and *five 
days later* discover that your code uses library A which uses library B 
which uses library C which uses library D which installs a harmless 
monkey-patch to print, but only if library E is installed, and you just 
happen to have E installed even though your code never uses it, AND that 
monkey-patch clashes with a harmless monkey-patch to list.__getitem__ 
installed by library F. And even though each monkey-patch alone is 
harmless, the combination breaks your code's output.



Python allows, but does not encourage, monkey-patching of code written in 
pure Python, because it sometimes can be useful. It flat out prohibits 
monkey-patching of builtins, because it is just too dangerous.
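
A small sketch of that difference, assuming CPython 3. The replacement
function is hypothetical and is never actually installed; the point is
only that assignment to an attribute of the built-in list type is refused,
while the same assignment on a subclass written in Python is accepted.

```python
def wrapping_getitem(self, index):
    # Hypothetical replacement; the assignment below never succeeds.
    raise NotImplementedError

try:
    list.__getitem__ = wrapping_getitem
except TypeError as exc:
    patched = False
    print("refused:", exc)
else:
    patched = True

class MyList(list):
    pass

# A subclass is an ordinary Python class, so the same assignment works.
MyList.__getitem__ = lambda self, index: "patched"
print(MyList([1, 2, 3])[0])
```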

Ruby allows monkey-patching of everything. And the result was predictable:

http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/


-- 
Steven

Re: Negative array indicies and slice()

2012-10-29 Thread Roy Smith

In article mailman.3057.1351554215.27098.python-l...@python.org,
 Ian Kelly ian.g.ke...@gmail.com wrote:

 On Mon, Oct 29, 2012 at 5:24 PM, Roy Smith r...@panix.com wrote:
  I think you're missing the point of amortized constant time.  Yes, the
  first item appended to the list will be copied lg(20,000,000) ~= 25
  times, because the list will be resized that many times(*).  But, on
  average (I'm not sure if average is strictly the correct word here),
  each item will be copied only once.
 
  Still, it always struck me as odd that there's no preallocate() method.
  There are times when you really do know how many items you're going to
  add to the list, and doing a single allocation would be a win.  And it
  doesn't cost anything to provide it.  I suppose, however, if you're
  adding enough items that preallocating would really matter, then maybe
  you want an array instead.
 
  (*) I don't know the exact implementation; I'm assuming each resize is a
  factor of 2.
 
 The growth factor is approximately 1.125.  Approximately because
 there is also a small constant term.  The average number of copies per
 item converges on 8.

Wow, that's surprising.  It also makes it that much more surprising that 
there's no way to pre-allocate.

Re: Negative array indicies and slice()

2012-10-29 Thread Andrew Robinson


On 10/29/2012 06:53 AM, Chris Angelico wrote:

Can you provide links to these notes? I'm looking at
cpython/Include/sliceobject.h that has this comment:

/*

A slice object containing start, stop, and step data members (the
names are from range).  After much talk with Guido, it was decided to
let these be any arbitrary python type.  Py_None stands for omitted values.
*/

Also, the code for slice objects in CPython works with Py_ssize_t (a
signed quantity of the same length as size_t), which will allow at
least 2**31 for an index. I would guess that your crashes were nothing
to do with 20 million elements and slices.

ChrisA
Let's look at the source code rather than the web notes -- the source 
must be the true answer anyhow.


I downloaded the source code for python 3.3.0, as the tbz;
In the directory Python-3.3.0/Python, look at Python-ast.c, line 2089 ff.


Clearly a slice is malloced for a slice_ty type.
It has four elements: kind, lower, upper, and step.

So, tracing it back to the struct definition...

Include/Python-ast.h  has typedef struct _slice *slice_ty;

And, here's the answer!:

enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3};
struct _slice {
enum _slice_kind kind;
union {
struct {
expr_ty lower;
expr_ty upper;
expr_ty step;
} Slice;

struct {
asdl_seq *dims;
} ExtSlice;

struct {
expr_ty value;
} Index;

} v;
};


So, slice() does indeed have arbitrary python types included in it; 
contrary to what I read elsewhere.
expr_ty is a pointer to an arbitrary expression, so the actual structure 
is 4 pointers, at 32 bits each = 16 bytes.
The size of the structure itself, given in an earlier post, is 20 bytes 
-- which means one more pointer is involved, perhaps the one pointing to 
the slice structure itself.


Hmm...!

An empty tuple gives sys.getsizeof( () ) = 24.

But, I would expect a tuple to be merely a list of object pointers; 
hence I would expect 4 bytes for len(), and then a head pointer 4 bytes, 
and then a pointer for each object.

3 objects gives 12 bytes, + 8 = 16 bytes.

Then we need one more pointer so Python knows where the struct is...
So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size 
as slice() --


but it's 24, even when empty...
And 36 when initialized...
What are the extra 16 bytes for?

All I see is:
typedef struct { object** whatever } PyTupleObject;








Re: Negative array indicies and slice()

2012-10-29 Thread Chris Kaynor

On Mon, Oct 29, 2012 at 11:00 AM, Andrew Robinson
andr...@r3dsolutions.com wrote:

 Let's look at the source code rather than the web notes -- the source must
 be the true answer anyhow.

 I downloaded the source code for python 3.3.0, as the tbz;
 In the directory Python-3.3.0/Python, look at Python-ast.c, line 2089 
 ff.

 Clearly a slice is malloced for a slice_ty type.
 It has four elements: kind, lower, upper, and step.

 So, tracing it back to the struct definition...

 Include/Python-ast.h  has typedef struct _slice *slice_ty;

 And, here's the answer!:

 enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3};
 struct _slice {
 enum _slice_kind kind;
 union {
 struct {
 expr_ty lower;
 expr_ty upper;
 expr_ty step;
 } Slice;

 struct {
 asdl_seq *dims;
 } ExtSlice;

 struct {
 expr_ty value;
 } Index;

 } v;
 };


 So, slice() does indeed have arbitrary python types included in it; contrary
 to what I read elsewhere.
 expr_ty is a pointer to an arbitrary expression, so the actual structure is
 4 pointers, at 32 bits each = 16 bytes.
 The size of the structure itself, given in an earlier post, is 20 bytes --
 which means one more pointer is involved, perhaps the one pointing to the
 slice structure itself.

 Hmm...!

 An empty tuple gives sys.getsizeof( () ) = 24.

 But, I would expect a tuple to be merely a list of object pointers; hence I
 would expect 4 bytes for len(), and then a head pointer 4 bytes, and then a
 pointer for each object.
 3 objects gives 12 bytes, + 8 = 16 bytes.

 Then we need one more pointer so Python knows where the struct is...
 So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size as
 slice() --

 but it's 24, even when empty...
 And 36 when initialized...
 What are the extra 16 bytes for?

Every Python object requires two pieces of data, both of which are
pointer-sized (one is a pointer, one is an int the size of a pointer).
These are: a pointer to the object's type, and the object's reference
count.

A tuple actually does not need a head pointer: the head pointer is
merely an offset from the tuple's pointer. It merely has a ref count,
type, an item count, and pointers to its contents.

A slice has the same type pointer and reference count, then three
pointers to the start, stop, and step objects. This means a slice
object should be the same size as a two-item tuple: the tuple needs a
count, while that is fixed at 3 for a slice (though some items may be
unset).

NOTE: The above is taken from reading the source code for Python 2.6.
For some odd reason, I am getting that an empty tuple consists of 6
pointer-sized objects (48 bytes on x64), rather than the expected 3
pointer-sized (24 bytes on x64). Slices are showing up as the expected
5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1
pointer (8 bytes on x64) per item. I imagine I am missing something,
but cannot figure out what that would be.
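
One possible explanation for the three extra words is the garbage
collector's per-object header: the sys.getsizeof documentation says it
adds garbage-collector overhead for objects the collector manages, on top
of what the object's own __sizeof__ reports. That is offered as a likely
cause rather than a confirmed diagnosis of the numbers above. A minimal
sketch for comparing the two, assuming CPython 3 (the printed sizes depend
on version and platform):

```python
import struct
import sys

pointer = struct.calcsize("P")   # size of a C pointer on this build

# Each extra tuple item costs one pointer.
sizes = [sys.getsizeof(tuple(range(n))) for n in range(4)]
print("tuple sizes:", sizes)
print("per item:", sizes[1] - sizes[0], "pointer:", pointer)

# __sizeof__ reports the object alone; getsizeof may add the GC header.
print("empty tuple, __sizeof__:", ().__sizeof__())
print("empty tuple, getsizeof: ", sys.getsizeof(()))
print("slice, getsizeof:       ", sys.getsizeof(slice(1, 5, 2)))
```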


 All I see is:
 typedef struct { object** whatever } PyTupleObject;


Re: Negative array indicies and slice()

2012-10-29 Thread Andrew Robinson


On 10/29/2012 05:02 PM, Steven D'Aprano wrote:

On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote:


But, why can't I just overload the existing __getitem__ for lists and
not bother writing an entire class?

You say that as if writing an entire class was a big complicated
effort. It isn't. It is trivially simple, a single line:

class MyList(list):
 ...
No, I don't think it big and complicated.  I do think it has timing 
implications which are undesirable because of how *much* slices are used.
In an embedded target -- I have to optimize; and I will have to reject 
certain parts of Python to make it fit and run fast enough to be useful.



You can just overload that one method in a subclass of list.  Being
able to monkey-patch __getitem__ for the list class itself would not be
advisable, as it would affect all list slicing anywhere in your program
and possibly lead to some unexpected behaviors.

That's what I am curious about.
What unexpected behaviors would a monkey patch typically cause?

What part of unexpected is unclear?

Ahh -- The "I don't know" approach!  It's only unexpected if one is a bad 
programmer...!

Let me see if I can illustrate a flavour of the sort of things that can
happen if monkey-patching built-ins were allowed.

You create a list and print it:

# simulated output
py> x = [5, 2, 4, 1]
py> print(x)
[1, 2, 4, 5]

snip

Finally you search deep into the libraries used in your code, and *five
days later* discover that your code uses library A which uses library B
which uses library C which uses library D which installs a harmless
monkey-patch to print, but only if library E is installed, and you just
happen to have E installed even though your code never uses it, AND that
monkey-patch clashes with a harmless monkey-patch to list.__getitem__
installed by library F. And even though each monkey-patch alone is
harmless, the combination breaks your code's output.

Right, which means that people developing the libraries made 
contradictory assumptions.



Python allows, but does not encourage, monkey-patching of code written in
pure Python, because it sometimes can be useful. It flat out prohibits
monkey-patching of builtins, because it is just too dangerous.

Ruby allows monkey-patching of everything. And the result was predictable:

http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/



I read that post carefully; and the author purposely notes that he is 
exaggerating.

BUT Your point is still well taken.

What you are talking about is namespace preservation; and I am thinking 
about it. I can preserve it -- but only if I disallow true Python 
primitives in my own interpreter; I can't provide two sets in the memory 
footprint I am using.


From my perspective, the version of Python that I compile will not be 
supported by the normal python help; The predecessor which first forged 
this path, Pymite, has the same problems -- however, the benefits 
outweigh the disadvantages; and the experiment yielded useful 
information on what is redundant in Python (eg: range is not supported) 
and when that redundancy is important for some reason.


If someone had a clear explanation of the disadvantages of allowing an 
iterator, or a tuple -- in place of a slice() -- I would have no qualms 
dropping the subject.  However, I am not finding that yet.  I am finding 
very small optimization issues...


The size of an object is at least 8 bytes.  Hence, three numbers is 
going to be at least 24 bytes; and that's 24 bytes in *excess* of the 
size of slice() or tuple () which are merely containers.  So -- There 
*ARE* savings in memory when using slice(), but it isn't really 2x 
memory -- it's more like 20% -- once the actual objects are considered.


The actual *need* for a slice() object still hasn't been demonstrated.  I 
am thinking that the implementation of __getitem__() is very poor 
probably because of legacy issues.


A tuple can also hold None, so ( 1, None, 2 ) is still a valid Tuple.
Alternately:  An iterator, like xrange(), could be made which takes None 
as a parameter, or a special value like 'inf'.
Since these two values would never be passed to xrange by already 
developed code, allowing them would not break working code.


I am only aware of one possible reason that slice() was once thought to 
be necessary; and that is because accessing the element of a tuple would 
recursively call __getitem__ on the tuple.  But, even that is easily 
dismissed once the fixed integer indexes are considered.


Your thoughts?  Do you have any show stopper insights?









Re: Negative array indicies and slice()

2012-10-29 Thread Andrew Robinson


On 10/29/2012 06:49 PM, Chris Kaynor wrote:
Every Python object requires two pieces of data, both of which are 
pointer-sized (one is a pointer, one is an int the size of a pointer). 
These are: a pointer to the object's type, and the object's reference 
count. A tuple actually does not need a head pointer: the head pointer 
is merely an offset from the tuple's pointer. It merely has a ref 
count, type, an item count, and pointers to its contents. A slice has 
the same type pointer and reference count, then three pointers to the 
start, stop, and step objects. This means a slice object should be the 
same size as a two-item tuple: the tuple needs a count, while that is 
fixed at 3 for a slice (though some items may be unset). NOTE: The 
above is taken from reading the source code for Python 2.6. For some 
odd reason, I am getting that an empty tuple consists of 6 
pointer-sized objects (48 bytes on x64), rather than the expected 3 
pointer-sized (24 bytes on x64). Slices are showing up as the expected 
5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1 
pointer (8 bytes on x64) per item. I imagine I am missing something, 
but cannot figure out what that would be.

All I see is:
typedef struct { object** whatever } PyTupleObject;

It's fairly straightforward in 3.2.0.  I debugged the code with GDB and 
watched.

Perhaps it is the same in 2.6 ?

In addition to those items you mention, of which the reference count is 
not even *inside* the struct -- there is additional debugging 
information not mentioned.  Built in objects contain a line number, a 
column number, and a context pointer.  These each require a full 
word of storage.


Also, built in types appear to have a kind field which indicates the 
object type but is not a pointer.  That suggests two object type 
indicators, a generic pointer (probably pointing to builtin? somewhere 
outside the struct) and a specific one (an enum) inside the C struct.


Inside the tuple struct, I count 4 undocumented words of information.
Over all, there is a length, the list of pointers, a kind, line, 
col and context; making 6 pieces in total.


Although your comment says the head pointer is not required; I found in 
3.3.0 that it is a true head pointer; The Tuple() function on line 2069 
of Python-ast.c, (3.3 version) -- is passed in a pointer called *elts.  
That pointer is copied into the Tuple struct.


How ironic,  slices don't have debugging info, that's the main reason 
they are smaller.

When I do slice(3,0,2), surprisingly Slice() is NOT called.
But when I do a[1:2:3] it *IS* called.
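
A likely reason for that difference: the Slice() function in Python-ast.c
builds a syntax-tree node while source text is being compiled, so it runs
for slice syntax such as a[1:2:3] but not for a call to the builtin
slice(), which the parser sees as an ordinary function call. If so, the
struct quoted earlier describes the compiler's AST node, not the run-time
slice object declared in Include/sliceobject.h. The sketch below, assuming
Python 3 and using a hypothetical Probe class, shows the two parse results
with the standard ast module.

```python
import ast

subscript = ast.parse("a[1:2:3]", mode="eval").body
call = ast.parse("slice(3, 0, 2)", mode="eval").body

# Slice syntax produces a Slice node inside a Subscript node.
print(type(subscript).__name__, type(subscript.slice).__name__)

# The builtin is parsed as an ordinary function call.
print(type(call).__name__, call.func.id)

# At run time both forms end up as the same kind of slice object.
class Probe:
    def __getitem__(self, index):
        return index

assert Probe()[1:2:3] == slice(1, 2, 3)
```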





Re: Negative array indicies and slice()

2012-10-29 Thread Michael Torrie

On 10/29/2012 01:34 PM, Andrew Robinson wrote:
 No, I don't think it big and complicated.  I do think it has timing 
 implications which are undesirable because of how *much* slices are used.
 In an embedded target -- I have to optimize; and I will have to reject 
 certain parts of Python to make it fit and run fast enough to be useful.

Since you can't port the full Python system to your embedded machine
anyway, why not just port a subset of python and modify it to suit your
needs right there in the C code.  It would be a fork, yes, but any port
to this target will be a fork anyway.  No matter how you cut it, it
won't be easy at all, and won't be easy to maintain.  You'll basically
be writing your own implementation of Python (that's what
python-on-a-chip is, and that's why it's closer to Python 2.x than
Python 3).  That's totally fine, though.  I get the impression you think
you will be able to port cPython as is to your target.  Without a libc,
an MMU on the CPU, and a kernel, it's not going to just compile and run.

Anyway, the only solution, given your constraints, is to implement your
own python interpreter to handle a subset of Python, and modify it to
suit your tastes.  What you want with slicing behavior changes has no
place in the normal cPython implementation, for a lot of reasons.  The
main one is that it is already possible to implement what you are
talking about in your own python class, which is a fine solution for a
normal computer with memory and CPU power available.

Negative array indicies and slice()

2012-10-28 Thread andrewr3mail

The slice operator does not give any way (I can find!) to take slices from 
negative to positive indexes, although the range is not empty, nor the expected 
indexes out of range that I am supplying.

Many programs that I write would require introducing variables and logical 
statements to correct the problem which is very lengthy and error prone unless 
there is a simple work around.

I *hate* replicating code every time I need to do this!

I also don't understand why slice() is not equivalent to an iterator, but can 
replace an integer in __getitem__() whereas xrange() can't.


Here's an example for Linux shell, otherwise remove /bin/env...
{{{#!/bin/env python
a=[1,2,3,4,5,6,7,8,9,10]
print a[-4:3]  # I am interested in getting [7,8,9,10,1,2] but I get [].
}}}

Re: Negative array indicies and slice()

2012-10-28 Thread Ian Kelly

On Sun, Oct 28, 2012 at 9:12 PM,  andrewr3m...@gmail.com wrote:
 The slice operator does not give any way (I can find!) to take slices from 
 negative to positive indexes, although the range is not empty, nor the 
 expected indexes out of range that I am supplying.

 Many programs that I write would require introducing variables and logical 
 statements to correct the problem which is very lengthy and error prone 
 unless there is a simple work around.

 I *hate* replicating code every time I need to do this!

 I also don't understand why slice() is not equivalent to an iterator, but can 
 replace an integer in __getitem__() whereas xrange() can't.


 Here's an example for Linux shell, otherwise remove /bin/env...
 {{{#!/bin/env python
 a=[1,2,3,4,5,6,7,8,9,10]
 print a[-4:3]  # I am interested in getting [7,8,9,10,1,2] but I get [].
 }}}


For a sequence of length 10, a[-4:3] is equivalent to a[6:3],
which is an empty slice since index 6 is after index 3.

If you want it to wrap around, then take two slices and concatenate
them with a[-4:] + a[:3].
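
Both points can be checked directly. A minimal sketch, assuming Python 3;
slice.indices(len(a)) shows the start, stop and step that the list
actually uses once the negative index has been resolved.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Negative indexes are resolved against len(a) before anything is copied.
print(slice(-4, 3).indices(len(a)))

assert a[-4:3] == a[6:3] == []

# Tail followed by head gives the wrapped result.
wrapped = a[-4:] + a[:3]
print(wrapped)
```

Note that the concatenation yields seven items, the last four followed by
the first three.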

Re: Negative array indicies and slice()

2012-10-28 Thread MRAB


On 2012-10-29 03:12, andrewr3m...@gmail.com wrote:

The slice operator does not give any way (I can find!) to take slices from 
negative to positive indexes, although the range is not empty, nor the expected 
indexes out of range that I am supplying.

Many programs that I write would require introducing variables and logical 
statements to correct the problem which is very lengthy and error prone unless 
there is a simple work around.

I *hate* replicating code every time I need to do this!

I also don't understand why slice() is not equivalent to an iterator, but can 
replace an integer in __getitem__() whereas xrange() can't.


Here's an example for Linux shell, otherwise remove /bin/env...
{{{#!/bin/env python
a=[1,2,3,4,5,6,7,8,9,10]
print a[-4:3]  # I am interested in getting [7,8,9,10,1,2] but I get [].
}}}


If the stride is positive (if omitted it defaults to 1), the slice is
from the start index to one before the end index, and a negative index
counts from the end.

a[-4:3] is equivalent to a[len(a)-4:3], which is an empty list if
len(a)-4 >= 3.

It doesn't wrap around.


Re: Negative array indicies and slice()

2012-10-28 Thread andrewr3mail

On Sunday, October 28, 2012 8:43:30 PM UTC-7, Ian wrote:
 On Sun, Oct 28, 2012 at 9:12 PM, andrew wrote:
 
  The slice operator does not give any way (I can find!) to take slices from 
  negative to positive indexes, although the range is not empty, nor the 
  expected indexes out of range that I am supplying.
 
  Many programs that I write would require introducing variables and logical 
  statements to correct the problem which is very lengthy and error prone 
  unless there is a simple work around.
 
  I *hate* replicating code every time I need to do this!
 
  I also don't understand why slice() is not equivalent to an iterator, but 
  can replace an integer in __getitem__() whereas xrange() can't.
 
  Here's an example for Linux shell, otherwise remove /bin/env...
  {{{#!/bin/env python
  a=[1,2,3,4,5,6,7,8,9,10]
  print a[-4:3]  # I am interested in getting [7,8,9,10,1,2] but I get [].
  }}}
 
 For a sequence of length 10, a[-4:3] is equivalent to a[6:3],
 which is an empty slice since index 6 is after index 3.
 
 If you want it to wrap around, then take two slices and concatenate
 them with a[-4:] + a[:3].

Hi Ian,
Well, no it really isn't equivalent.
Consider a programmer who writes:
xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2]

That is the idea of a range; for what reason would anyone *EVER* want -4 to 
+3 to be 6:3???

I do agree that the data held in -4 is equivalent to the data in 6, but the 
index is not the same.

So: Why does python choose to convert them to positive indexes, and have slice 
operate differently than xrange -- for the slice() object can't possibly know 
the size of the array when it is passed in to __getitem__;  They are totally 
separate classes.

I realize I can concat. two slice ranges, BUT, the ranges do not always span 
from negative to positive.

E.g., a line in my program reads:
a[x-5:x]

If x is 7, then this is a positive index to a positive index,
so there is no reason to concatenate two slices!

I use this arbitrary range code *often* so I need a general purpose solution.
I looked up slice() but the help is of no use, I don't even know how I might 
overload it to embed some logic to concatenate ranges of data; nor even if it 
is possible.


Re: Negative array indicies and slice()

2012-10-28 Thread Andrew

On Sunday, October 28, 2012 8:43:30 PM UTC-7, Ian wrote:
 On Sun, Oct 28, 2012 at 9:12 PM, Andrew wrote:
 
  The slice operator does not give any way (I can find!) to take slices from 
  negative to positive indexes, although the range is not empty, nor the 
  expected indexes out of range that I am supplying.
 
  Many programs that I write would require introducing variables and logical 
  statements to correct the problem which is very lengthy and error prone 
  unless there is a simple work around.
 
  I *hate* replicating code every time I need to do this!
 
  I also don't understand why slice() is not equivalent to an iterator, but 
  can replace an integer in __getitem__() whereas xrange() can't.
 
  Here's an example for Linux shell, otherwise remove /bin/env...
  {{{#!/bin/env python
  a=[1,2,3,4,5,6,7,8,9,10]
  print a[-4:3]  # I am interested in getting [7,8,9,10,1,2] but I get [].
  }}}
 
 For a sequence of length 10, a[-4:3] is equivalent to a[6:3],
 which is an empty slice since index 6 is after index 3.
 
 If you want it to wrap around, then take two slices and concatenate
 them with a[-4:] + a[:3].

Hi Ian,
Well, no it really isn't equivalent; although Python implements it as 
equivalent.

Consider a programmer who writes:
xrange(-4,3) 

They clearly *want* [-4,-3,-2,-1,0,1,2]

That is the idea of a range; So, for what reason would anyone want -4 to +3 
to be 6:3???  Can you show me some code where this is desirable??

I do agree that the data held in -4 is equivalent to the data in 6, but the 
index is not the same.

So: Why does python choose to convert them to positive indexes, and have slice 
operate differently than xrange -- for the slice() object can't possibly know 
the size of the array when it is passed in to __getitem__;  They are totally 
separate classes.

I realize I can concat. two slice ranges, BUT, the ranges do not always span 
from negative to positive.

E.g., a line in my program reads:
a[x-5:x]

If x is 7, then this is a positive index to a positive index,
so there is no reason to concatenate two slices!

I use this arbitrary range code *often* so I need a general purpose solution.
I looked up slice() but the help is of no use, I don't even know how I might 
overload it to embed some logic to concatenate ranges of data; nor even if it 
is possible. 
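
One general-purpose approach, offered here only as a sketch and not taken from the thread, is to treat the indexes as positions on a ring and reduce each one modulo the length. The helper name ring_slice is hypothetical, and range() stands in for xrange() so the code runs on Python 3.

```python
def ring_slice(seq, start, stop):
    """Return the items at indexes start..stop-1, wrapping around the ends."""
    n = len(seq)
    if n == 0:
        return []
    # i % n maps any integer, negative or not, onto a valid index 0..n-1.
    return [seq[i % n] for i in range(start, stop)]


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

x = 7
print(ring_slice(a, x - 5, x))  # [3, 4, 5, 6, 7]

x = 3
print(ring_slice(a, x - 5, x))  # [9, 10, 1, 2, 3]
```

The same call works whether or not the span crosses index 0, so no branching is needed at the call site. It always returns a list, and a span longer than len(seq) repeats elements.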

Re: Negative array indicies and slice()

2012-10-28 Thread Ian Kelly

On Sun, Oct 28, 2012 at 10:00 PM,  andrewr3m...@gmail.com wrote:
 Hi Ian,
 Well, no it really isn't equivalent.
 Consider a programmer who writes:
 xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2]

 That is the idea of a range; for what reason would anyone *EVER* want -4 to 
 +3 to be 6:3???

That is what ranges do, but your question was about slices, not ranges.

 So: Why does python choose to convert them to positive indexes, and have 
 slice operate differently than xrange -- for the slice() object can't 
 possibly know the size of the array when it is passed in to __getitem__;  
 They are totally separate classes.

Ranges can contain negative integers.  However, sequences do not have
negative indices.  Therefore, negative indices in slices are used to
count from the end instead of from the start.  As stated in the
language docs, "If either bound is negative, the sequence’s length is
added to it."  Therefore, a[-4:3] does not wrap around the end of
the sequence because a[6:3] does not wrap around the end of the
sequence.
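
The distinction can be sketched as follows. This assumes Python 3, where range() plays the role of xrange(); the name n is only illustrative.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# A range is a sequence of integers, so negative values are simply members.
print(list(range(-4, 3)))  # [-4, -3, -2, -1, 0, 1, 2]

# In a slice, a negative bound is an offset from the end of the sequence.
n = len(a)
print(a[-4:3] == a[n - 4:3])  # True
print(a[n - 4:3])             # []
```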

 I realize I can concat. two slice ranges, BUT, the ranges do not always span 
 from negative to positive.

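def wrapping_slice(seq, start, stop):
    # Resolve negative bounds against len(seq), as ordinary slicing does.
    start, stop, _ = slice(start, stop).indices(len(seq))
    if start <= stop:
        return seq[start:stop]
    else:
        # start is past stop: take the tail, then wrap around to the head.
        return seq[start:] + seq[:stop]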

You'll have to decide for yourself whether you want it to return an
empty list or the entire list if start == stop.
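
A short usage sketch follows. It reproduces the function from the message above, assuming the comparison is start <= stop, and runs it on the list from the original post (Python 3 syntax).

```python
def wrapping_slice(seq, start, stop):
    # Resolve negative bounds against len(seq), as ordinary slicing does.
    start, stop, _ = slice(start, stop).indices(len(seq))
    if start <= stop:
        return seq[start:stop]
    # start is past stop: take the tail, then wrap around to the head.
    return seq[start:] + seq[:stop]


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(wrapping_slice(a, -4, 3))  # [7, 8, 9, 10, 1, 2, 3]
print(wrapping_slice(a, 2, 7))   # [3, 4, 5, 6, 7]
print(wrapping_slice(a, 4, 4))   # []
```

With <= the start == stop case yields an empty list. Changing the test to < would instead send that case to the wrapping branch and return every element, starting at index start.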

Re: Negative array indicies and slice()

2012-10-28 Thread alex23

On Oct 29, 2:09 pm, Andrew andrewr3m...@gmail.com wrote:
 I use this arbitrary range code *often* so I need a general purpose solution.
 I looked up slice() but the help is of no use, I don't even know how I might
 overload it to embed some logic to concatenate ranges of data; nor even if
 it is possible.

Slices are passed in if provided to __getitem__/__setitem__/
__delitem__, so you'd need to override it at the list level:

class RangedSlicer(list):
    def __getitem__(self, item):
        # map item.start, .stop and .step to your own semantics
        # (placeholder: fall back to the normal list behaviour)
        return list.__getitem__(self, item)

Then wrap your lists with your RangedSlicer class as needed.
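
One way the skeleton might be filled in is sketched below. The wrapping rule (wrap only when the written start is negative and the written stop is not) is an assumption chosen for illustration, not something specified in the thread. The sketch targets Python 3; on Python 2, plain a[i:j] slices of a list subclass are routed through __getslice__, so that method would need overriding as well.

```python
class RangedSlicer(list):
    """List whose slices wrap when start is negative and stop is not."""

    def __getitem__(self, item):
        if (isinstance(item, slice)
                and item.step in (None, 1)
                and item.start is not None and item.start < 0
                and item.stop is not None and item.stop >= 0):
            # Tail of the list, followed by the head up to stop.
            tail = list.__getitem__(self, slice(item.start, None))
            head = list.__getitem__(self, slice(None, item.stop))
            return tail + head
        # Everything else keeps the normal list behaviour.
        return list.__getitem__(self, item)


a = RangedSlicer([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print(a[-4:3])  # [7, 8, 9, 10, 1, 2, 3]
print(a[2:7])   # [3, 4, 5, 6, 7]
print(a[-1])    # 10
```

Plain integer indexes and slices that do not cross index 0 behave exactly as they do on a built-in list. Slices return ordinary lists, not RangedSlicer instances.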

Re: Negative array indicies and slice()

2012-10-28 Thread Paul Rubin

Andrew andrewr3m...@gmail.com writes:
 So: Why does python choose to convert them to positive indexes, and
 have slice operate differently than xrange 

There was a thread a few years back, I think started by Bryan Olson,
that made the case that slice indexing is a Python wart for further
reasons than the above, and suggesting a notation like x[$-5] to denote
what we now call x[-5] (i.e. $ is the length of the string).  So your
example x[$-4:3] would clearly be the same thing as x[6:3] and not give
any suggestion that it might wrap around.