Fwd: Re: Negative array indicies and slice()
Forwarded to python list: Original Message Subject:Re: Negative array indicies and slice() Date: Sat, 03 Nov 2012 15:32:04 -0700 From: Andrew Robinson Reply-To: andr...@r3dsolutions.com To: Ian Kelly On 11/01/2012 05:32 PM, Ian Kelly wrote: On Thu, Nov 1, 2012 at 4:25 PM, Andrew Robinson The bottom line is: __getitem__ must always *PASS* len( seq ) to slice() each *time* the slice() object is-used. Since this is the case, it would have been better to have list, itself, have a default member which takes the raw slice indicies and does the conversion itself. The size would not need to be duplicated or passed -- memory savings, speed savings... And then tuple would need to duplicate the same code. As would deque. And str. And numpy.array, and anything else that can be sliced, including custom sequence classes. I don't think that's true. A generic function can be shared among different objects without being embedded in an external index data structure to boot! If *self* were passed to an index conversion function (as would naturally happen anyway if it were a method), then the method could take len( self ) without knowing what the object is; Should the object be sliceable -- the len() will definitely return the required piece of information. Numpy arrays are very different internally from lists. Of course! (Although, lists do allow nested lists.) I'm not understanding what this is meant to demonstrate. Is MyClass a find-replace error of ThirdParty? Why do you have __getitem__ returning slice objects instead of items or subsequences? What does this example have to do with numpy? Here's a very cleaned up example file, cut and pastable: #!/bin/env python # File: sliceIt.py --- a pre PEP357 hypothesis test skeleton class Float16(): Numpy creates a float type, with very limited precision -- float16 Rather than force you to install np for this test, I'm just making a faux object. normally we'd just import np def __init__(self,value): self.value = value def AltPEP357Solution(self): This is doing exactly what __index__ would be doing. return None if self.value is None else int( self.value ) class ThirdParty( list ): A simple class to implement a list wrapper, having all the properties of a normal list -- but explicitly showing portions of the interface. def __init__(self, aList): self.aList = aList def __getitem__(self, aSlice): print( __getitems__, aSlice ) temp=[] edges = aSlice.indices( len( self.aList ) ) # *unavoidable* call for i in range( *edges ): temp.append( self.aList[ i ] ) return temp def Inject_FloatSliceFilter( theClass ): This is a courtesy function to allow injecting (duck punching) a float index filter into a user object. def Filter_FloatSlice( self, aSlice ): # Single index retrieval filter try: start=aSlice.AltPEP357Solution() except AttributeError: pass else: return self.aList[ start ] # slice retrieval filter try: start=aSlice.start.AltPEP357Solution() except AttributeError: start=aSlice.start try: stop=aSlice.stop.AltPEP357Solution() except AttributeError: stop=aSlice.stop try: step=aSlice.step.AltPEP357Solution() except AttributeError: step=aSlice.step print( Filter To,start,stop,step ) return self.super_FloatSlice__getitem__( slice(start,stop,step) ) theClass.super_FloatSlice__getitem__ = theClass.__getitem__ theClass.__getitem__ = Filter_FloatSlice # EOF: sliceIt.py Example run: from sliceIt import * test = ThirdParty( [1,2,3,4,5,6,7,8,9] ) test[0:6:3] ('__getitems__', slice(0, 6, 3)) [1, 4] f16=Float16(8.3) test[0:f16:2] ('__getitems__', slice(0,sliceIt.Float16 instance at 0xb74baaac, 2)) Traceback (most recent call last): File stdin, line 1, inmodule File sliceIt.py, line 26, in __getitem__ edges = aSlice.indices( len( self.aList ) ) # This is an *unavoidable* call TypeError: object cannot be interpreted as an index Inject_FloatSliceFilter( ThirdParty ) test[0:f16:2] ('Filter To', 0, 8, 2) ('__getitems__', slice(0, 8, 2)) [1, 3, 5, 7] test[f16] 9 We could also require the user to explicitly declare when they're performing arithmetic on variables that might not be floats. Then we can turn off run-time type checking unless the user explicitly requests it, all in the name of micro-optimization and explicitness. :) None of those would help micro-optimization that I can see. Seriously, whether x is usable as a sequence index is a property of x, not a property of the sequence. Yes, but the *LENGTH* of the sequence is a function of the *sequence*. Users shouldn't need to pick and choose *which* particular sequence index types their custom sequences are willing
Re: Negative array indicies and slice()
Hi Ian, I apologize for trying your patience with the badly written code example. All objects were meant to be ThirdParty(), the demo was only to show how a slice() filter could have been applied for the reasons PEP357 made index() to exist. eg: because numpy items passed to __getitems__ via slice syntax [::] were illegal values. PEP 357 is the one who specifically mentioned Numpy types -- which is the only reason I used the name in the example; I could have just as well used a string. I am fully aware of what numpy does -- I have used it; modified the fortran interfaces underneath, etc. The index() method, however, affects *all* list objects in Python, not just Numpy's -- correct? I'll write a working piece of code tomorrow to demonstrate the filter very clearly rather than a skeleton, and test it before posting. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 11/2/12 8:57 AM, Andrew Robinson wrote: Hi Ian, I apologize for trying your patience with the badly written code example. All objects were meant to be ThirdParty(), the demo was only to show how a slice() filter could have been applied for the reasons PEP357 made index() to exist. eg: because numpy items passed to __getitems__ via slice syntax [::] were illegal values. PEP 357 is the one who specifically mentioned Numpy types -- which is the only reason I used the name in the example; I could have just as well used a string. I am fully aware of what numpy does -- I have used it; modified the fortran interfaces underneath, etc. The index() method, however, affects *all* list objects in Python, not just Numpy's -- correct? Please forget that PEP 357 mentions slices at all. The motivation for the __index__() method (not index()) goes far beyond slices. I'm not really sure why they are given such a prominent place in the PEP. Let me try to lay out the motivation more clearly. numpy has objects that represent integers but cannot be subclasses of the Python int or long objects because their internal representations are different. These are the width-specific types: uint8, int16, int64, etc. Before __index__() was introduced, all indexing operations in the builtin Python sequence types strictly checked for int or long objects and rejected other objects. We wanted to provide a generic method that third party types could implement to say, Yes, I really am an integer, here is my value in a canonical representation you can understand. We could not use __int__() for this purpose because it has additional semantics, namely conversion from not-integers to integers. This is why floats are mentioned; they do not generally represent integers but they do define an __int__() method for their conversion to ints via the floor() function. Generally, they should be rejected as indices. With the __index__() method, we have a solution: int16 and the rest get __index__() methods and float doesn't. This is used where an integer index or offset is needed, not just in slices. List indices, file.seek(), mmap.mmap(), etc. The change to use PyIndex_Check() instead of PyInt_Check() was not very difficult or extensive. Even if you were to change the slicing API for your other reasons, __index__() would still be needed. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/31/12 8:16 PM, Andrew Robinson wrote: On 10/31/2012 02:20 PM, Ian Kelly wrote: On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote: Then; I'd note: The non-goofy purpose of slice is to hold three data values; They are either numbers or None. These *normally* encountered values can't create a memory loop. So, FOR AS LONG, as the object representing slice does not contain an explicit GC pair; I move that we mandate (yes, in the current python implementation, even as a *fix*) that its named members may not be assigned any objects other than None or numbers eg: Lists would be forbidden Since functions, and subclasses, can be test evaluated by int( the_thing_to_try ) and *[] can too, generality need not be lost for generating nothing or numbers. PEP 357 requires that anything implementing the __index__ special method be allowed for slicing sequences (and also that __index__ be used for the conversion). For the most part, that includes ints and numpy integer types, but other code could be doing esoteric things with it. I missed something... (but then that's why we're still talking about it...) Reading the PEP, it notes that *only* integers (or longs) are permitted in slice syntax. (Overlooking None, of course... which is strange...) The PEP gives the only exceptions as objects with method __index__. Automatically, then, an empty list is forbidden (in slice syntax). However, What you did, was circumvent the PEP by passing an empty list directly to slice(), and avoiding running it through slice syntax processing. Why do you think it is forbidden by the syntax? [~] |1 class A(object): .. def __getitem__(self, key): .. return key .. [~] |2 a = A() [~] |3 a[[]:] slice([], None, None) The PEP is a little unclear and refers to a state of the Python interpreter that no longer exists. At the time, I think __getslice__() was still not deprecated, and it did require ints (or after the PEP, __index__able objects). __getslice__() is now deprecated in favor of __getitem__() where you can interpret slice objects with arbitrary objects to your heart's content. Arbitrary objects *are* definitely allowed by the slice syntax (how could the syntax know what is an int and what is another kind of object?). Most objects that interpret slices, especially the builtin sequence types, do require __index__able objects (or None). So, what's the psychology behind allowing slice() to hold objects which are not converted to ints/longs in the first place? In numpy, we (ab)use this freedom for some convenient notation in special objects. We have a couple of grid-making convenience objects: [~] |5 numpy.mgrid[1.5:2.5:0.1] array([ 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4]) This syntax uses the start:stop:step notation to make a float range. If we use an imaginary integer in the step slot, mgrid will interpret it as the number of items requested instead of the step. [~] |6 numpy.mgrid[1.5:2.5:11j] array([ 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4, 2.5]) -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
Andrew Robinson wrote: On 10/31/2012 02:20 PM, Ian Kelly wrote: On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote: Then; I'd note: The non-goofy purpose of slice is to hold three data values; They are either numbers or None. These *normally* encountered values can't create a memory loop. So, FOR AS LONG, as the object representing slice does not contain an explicit GC pair; I move that we mandate (yes, in the current python implementation, even as a *fix*) that its named members may not be assigned any objects other than None or numbers eg: Lists would be forbidden Since functions, and subclasses, can be test evaluated by int( the_thing_to_try ) and *[] can too, generality need not be lost for generating nothing or numbers. PEP 357 requires that anything implementing the __index__ special method be allowed for slicing sequences (and also that __index__ be used for the conversion). For the most part, that includes ints and numpy integer types, but other code could be doing esoteric things with it. I missed something... (but then that's why we're still talking about it...) Reading the PEP, it notes that *only* integers (or longs) are permitted in slice syntax. Keep in mind that PEPs represent Python /at that time/ -- as Python moves forward, PEPs are not updated (this has gotten me a couple times). The change would be backward-incompatible in any case, since there is certainly code out there that uses non-numeric slices -- one example has already been given in this thread. Hmmm. Now, I'm thinking -- The purpose of index(), specifically, is to notify when something which is not an integer may be used as an index; You've helpfully noted that index() also *converts* those objects into numbers. Ethan Fullman mentioned that he used the names of fields, instead of having to remember the _offsets_; Which means that his values _do convert_ to offset numbers Furman, actually. :) And my values do *not* convert to indices (at least, not automatically). My __getitem__ code looks like: elif isinstance(item, slice): sequence = [] if isinstance(item.start, (str, unicode)) \ or isinstance(item.stop, (str, unicode)): field_names = dbf.field_names(self) start, stop, step = item.start, item.stop, item.step if start not in field_names or stop not in field_names: raise MissingFieldError( Either %r or %r (or both) are not valid field names % (start, stop)) if step is not None and not isinstance(step, (int, long)): raise DbfError( step value must be an int or long, not %r % type(step)) start = field_names.index(start) stop = field_names.index(stop) + 1 item = slice(start, stop, step) for index in self._meta.fields[item]: sequence.append(self[index]) return sequence In other words, the slice contains the strings, and my code calculates the offsets -- Python doesn't do it for me. His example was actually given in slice syntax notation [::]. Hence, his objects must have an index() method, correct?. Nope. ~Ethan~ -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Fri, Nov 2, 2012 at 1:12 AM, Ethan Furman et...@stoneleaf.us wrote: In other words, the slice contains the strings, and my code calculates the offsets -- Python doesn't do it for me. That's correct, but you're still translating those strings into numeric indices. You can slice a database record based on column names (though personally I would recommend against it - creates too much dependence on column order, which I prefer to treat as non-significant), but you can't, for instance, slice a dictionary by keys: foo = {asdf:123,qwer:234,zxcv:345,1234:456} foo[qwer:1234] # What should this return? I suppose conceptually you could slice any iterable by discarding till you match the start, then yielding till you match the stop, then returning (it'd function like itertools.islice but using non-numeric indices - somehow). But it still depends on there being a dependable order. (Incidentally, isinstance(X, (str, unicode)) can become isinstance(X, basestring) - they both inherit from that.) ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
Chris Angelico wrote: On Fri, Nov 2, 2012 at 1:12 AM, Ethan Furman et...@stoneleaf.us wrote: In other words, the slice contains the strings, and my code calculates the offsets -- Python doesn't do it for me. That's correct, but you're still translating those strings into numeric indices. True, but the point is that the /slice/ contains a data type that is neither a number, nor directly translatable into a number (that is, no __index__ method), and my code would cease to function should that change to slices be made. ~Ethan~ -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 11/01/2012 07:12 AM, Ethan Furman wrote: Andrew Robinson wrote: On 10/31/2012 02:20 PM, Ian Kelly wrote: On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote: Then; I'd note: The non-goofy purpose of slice is to hold three data values; They are either numbers or None. These *normally* encountered values can't create a memory loop. So, FOR AS LONG, as the object representing slice does not contain an explicit GC pair; snip A little review... The premise of my statement here, is that Tim Peter's closed the Bug report; http://bugs.python.org/issue1501180 With the *reason* being that using GC was *goofy* on account of what slice() was intended to hold, None and a number. So, My first attempt at bug fix was simply to take Tim Peter's at his word... since we all assume he *isn't* a Bloody Idiot. Hey isn't that a swear-word somewhere in the world? Its not where I live, but I seem to recall... oh, well... whatever. I missed something... (but then that's why we're still talking about it...) Reading the PEP, it notes that *only* integers (or longs) are permitted in slice syntax. Keep in mind that PEPs represent Python /at that time/ -- as Python moves forward, PEPs are not updated (this has gotten me a couple times). And, since I am reading them in the order written (but in 3.0) trying to get the whole of Python into my mind on the journey to prep for porting it into a tiny chip -- I'm frustrated by not being finished yet... Furman, actually. :) :-! And my values do *not* convert to indices (at least, not automatically). Ahhh (Rhetorical sarcastic) I was wondering how you added index() method to strings, not access it, and still be following the special PEP we are talking about,when you gave that example using unwrapped strings. -- H was that PEP the active state of Python, when Tim rejected the bug report? eg: have we moved on into a place where the bug report ought to be re-issued since that PEP is now *effectively* passe, and Tim could thus be vindicated from being a b... Idiot? (Or has he been given the 1st place, Python Twit award -- and his *man* the bug list been stripped?) In other words, the slice contains the strings, and my code calculates the offsets -- Python doesn't do it for me. ~Ethan~ I see, so the problem is that PEP wants you to implement the index(), but that is going to cause you to subclass string, and add a wrapper interface every time you need to index something. eg: doing something llke --- mydbclass[ MyString( 'fromColumn' ) : MyString( 'toColum' ) ] and the code becomes a candy machine interface issue (Chapter 5, Writing Solid Code). My favorite line there uses no swearing If they had just taken an extra *30* seconds thinking about their design, they could have saved me, and I'm sure countless others, from getting something they didn't want. I laugh, if they didn't get it already -- an extra *30* seconds is WY to optimistic. Try minutes at least, will a policeman glaring over their shoulder. But anyhow --- The problem lies in *when* the conversion to an integer is to take place, not so much if it is going to happen. Your indexes, no matter how disguised, eventually will become numbers; and you have a way that minimizes coding cruft (The very reason I started the thread, actually... subclassing trivially to fix candy machine interfaces leads to perpetual code increases -- In cPython source-code, realloc wrappers and malloc wrappers are found I've seen these wrappers *re*-invented in nearly every C program I've every looked at! Talk about MAN-hours, wasted space, and cruft.) So; is this a reasonable summary of salient features (status quo) ? * Enforcing strict numerical indexes (in the slice [::] operator) causes much psychological angst when attempting to write clear code without lots of wrapper cruft. * Pep 357 merely added cruft with index(), but really solved nothing. Everything index() does could be implemented in __getitem__ and usually is. * slice().xxxs are merely a container for *whatever* was passed to [::] * slice() is * slice is also a full blown object, which implements a trivial method to dump the contents of itself to a tuple. * presently slice() allows memory leaks through GC loops. * Slice(), even though an object with a constructor, does no error checking to deny construction of memory leaks. If people would take 30 seconds to think about this the more details added -- the more comprehensive can be my understanding -- and perhaps a consensus reached about the problem. These are a list of relevant options, without respect to feasability. * Don't bother to fix the bug; allow Python to crash with a subtle bug that often take weeks to track down by the very small minority doing strange things (Equivalent to the monkey patch syndrome of D'Aprano; BTW: The longer the bug is left unfixed, the
Re: Negative array indicies and slice()
On Thu, Nov 1, 2012 at 10:32 PM, Andrew Robinson andr...@r3dsolutions.com wrote: presently slice() allows memory leaks through GC loops. Forgive me if I've missed something here, but isn't it only possible to make a refloop by decidedly abnormal behaviour? Stuff like: a=[]; a.append(slice(a)) Seriously, who does this? First you have to have a reference to a container in place of an index, and then you have to retain the slice object inside that same container as well. Neither operation is normal use of a slice. Where is the problem? ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Thu, Nov 1, 2012 at 5:32 AM, Andrew Robinson andr...@r3dsolutions.com wrote: H was that PEP the active state of Python, when Tim rejected the bug report? Yes. The PEP was accepted and committed in March 2006 for release in Python 2.5. The bug report is from June 2006 has a version classification of Python 2.5, although 2.5 was not actually released until September 2006. Pep 357 merely added cruft with index(), but really solved nothing. Everything index() does could be implemented in __getitem__ and usually is. No. There is a significant difference between implementing this on the container versus implementing it on the indexes. Ethan implemented his string-based slicing on the container, because the behavior he wanted was specific to the container type, not the index type. Custom index types like numpy integers on the other hand implement __index__ on the index type, because they apply to all sequences, not specific containers. This must be separate from standard int conversion, because standard int conversion is too general for indexing. slice is also a full blown object, which implements a trivial method to dump the contents of itself to a tuple. slice.indices() does not trivially dump its contents as given. It takes a sequence length and adjusts its indices to that length. The C implementation of this is around 60 lines of code. Don't bother to fix the bug; allow Python to crash with a subtle bug that often take weeks to track down by the very small minority doing strange things (Equivalent to the monkey patch syndrome of D'Aprano; BTW: The longer the bug is left unfixed, the more people will invent uses for it ) It's been 6 years already. AFAIK nobody has invented any uses that are actually at risk of invoking the GC bug. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
Ian Kelly wrote: On Thu, Nov 1, 2012 at 5:32 AM, Andrew Robinson wrote: Don't bother to fix the bug; allow Python to crash with a subtle bug that often take weeks to track down by the very small minority doing strange things (Equivalent to the monkey patch syndrome of D'Aprano; BTW: The longer the bug is left unfixed, the more people will invent uses for it ) It's been 6 years already. AFAIK nobody has invented any uses that are actually at risk of invoking the GC bug. The bug is not that slice allows non-numbers, but that slice objects aren't tracked by gc; I'm not seeing an issue with not fixing the bug. ~Ethan~ -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 11/01/2012 12:07 PM, Ian Kelly wrote: On Thu, Nov 1, 2012 at 5:32 AM, Andrew Robinson andr...@r3dsolutions.com wrote: H was that PEP the active state of Python, when Tim rejected the bug report? Yes. The PEP was accepted and committed in March 2006 for release in Python 2.5. The bug report is from June 2006 has a version classification of Python 2.5, although 2.5 was not actually released until September 2006. That explain's Peter's remark. Thank you. He looks *much* smarter now. Pep 357 merely added cruft with index(), but really solved nothing. Everything index() does could be implemented in __getitem__ and usually is. No. There is a significant difference between implementing this on the container versus implementing it on the indexes. Ethan implemented his string-based slicing on the container, because the behavior he wanted was specific to the container type, not the index type. Custom index types like numpy integers on the other hand implement __index__ on the index type, because they apply to all sequences, not specific containers. Hmmm... D'Aprano didn't like the monkey patch;and sub-classing was his fix-all. Part of my summary is based on that conversation with him,and you touched on one of the unfinished points; I responded to him that I thought __getitem__ was under-developed. The object slice() has no knowledge of the size of the sequence; nor can it get that size on it's own, but must passively wait for it to be given to it. The bottom line is: __getitem__ must always *PASS* len( seq ) to slice() each *time* the slice() object is-used. Since this is the case, it would have been better to have list, itself, have a default member which takes the raw slice indicies and does the conversion itself. The size would not need to be duplicated or passed -- memory savings, speed savings... I'm just clay pidgeoning an idea out here Let's apply D'Aprano 's logic to numpy; Numpy could just have subclassed *list*; so let's ignore pure python as a reason to do anything on the behalf on Numpy: Then, lets' consider all thrid party classes; These are where subclassing becomes a pain -- BUT: I think those could all have been injected. class ThirdParty( list ): # Pretend this is someone else's... ... def __init__(self): return ... def __getitem__(self,aSlice): return aSlice ... We know it will default work like this: a=ThirdParty() a[1:2] slice(1, 2, None) # So, here's an injection... ThirdParty.superOnlyOfNumpy__getitem__ = MyClass.__getitem__ ThirdParty.__getitem__ = lambda self,aSlice: ( 1, 3, self.superOnlyOfNumpy__getitem__(aSlice ).step ) a[5:6] (1, 3, None) Numpy could have exported a (workable) function that would modify other list functions to affect ONLY numpy data types (eg: a filter). This allows user's creating their own classes to inject them with Numpy's filter only when they desire; Recall Tim Peter's explicit is better than implicit Zen? Most importantly normal programs not using Numpy wouldn't have had to carry around an extra API check for index() *every* single time the heavily used [::] happened. Memory speed both. It's also a monkey patch, in that index() allows *conflicting* assumptions in violation of the unexpected monkey patch interaction worry. eg: Numpy *CAN* release an index() function on their floats -- at which point a basic no touch class (list itself) will now accept float as an index in direct contradiction of PEP 357's comment on floats... see? My point isn't that this particular implementation I have shown is the best (or even really safe, I'd have to think about that for a while). Go ahead and shoot it down... My point is that, the methods found in slice(), and index() now have moved all the code regarding a sequence *out* of the object which has information on that sequence. It smacks of legacy. The Python parser takes values from many other syntactical constructions and passes them directly to their respective objects -- but in the case of list(), we have a complicated relationship; and not for any reason that can't be handled in a simpler way. Don't consider the present API legacy for a moment, I'm asking hypothetical design questions: How many users actually keep slice() around from every instance of [::] they use? If it is rare, why create the slice() object in the first place and constantly be allocating and de-allocating memory, twice over? (once for the original, and once for the repetitive method which computes dynamic values?) Would a single mutable have less overhead, since it is destroyed anyway? -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Thu, Nov 1, 2012 at 4:25 PM, Andrew Robinson andr...@r3dsolutions.com wrote: The bottom line is: __getitem__ must always *PASS* len( seq ) to slice() each *time* the slice() object is-used. Since this is the case, it would have been better to have list, itself, have a default member which takes the raw slice indicies and does the conversion itself. The size would not need to be duplicated or passed -- memory savings, speed savings... And then tuple would need to duplicate the same code. As would deque. And str. And numpy.array, and anything else that can be sliced, including custom sequence classes. Let's apply D'Aprano 's logic to numpy; Numpy could just have subclassed *list*; Numpy arrays are very different internally from lists. They are basically fancy wrappers of C arrays, whereas lists are a higher-level abstraction. They allow for multiple dimensions, which lists do not. Slices of numpy arrays produce views, whereas slices of lists produce brand new lists. And they certainly do not obey the Liskov Substitution Principle with respect to lists. class ThirdParty( list ): # Pretend this is someone else's... ... def __init__(self): return ... def __getitem__(self,aSlice): return aSlice ... We know it will default work like this: a=ThirdParty() a[1:2] slice(1, 2, None) # So, here's an injection... ThirdParty.superOnlyOfNumpy__getitem__ = MyClass.__getitem__ ThirdParty.__getitem__ = lambda self,aSlice: ( 1, 3, self.superOnlyOfNumpy__getitem__(aSlice ).step ) a[5:6] (1, 3, None) I'm not understanding what this is meant to demonstrate. Is MyClass a find-replace error of ThirdParty? Why do you have __getitem__ returning slice objects instead of items or subsequences? What does this example have to do with numpy? Numpy could have exported a (workable) function that would modify other list functions to affect ONLY numpy data types (eg: a filter). This allows user's creating their own classes to inject them with Numpy's filter only when they desire; Recall Tim Peter's explicit is better than implicit Zen? We could also require the user to explicitly declare when they're performing arithmetic on variables that might not be floats. Then we can turn off run-time type checking unless the user explicitly requests it, all in the name of micro-optimization and explicitness. Seriously, whether x is usable as a sequence index is a property of x, not a property of the sequence. Users shouldn't need to pick and choose *which* particular sequence index types their custom sequences are willing to accept. They should even be able to accept sequence index types that haven't been written yet. Most importantly normal programs not using Numpy wouldn't have had to carry around an extra API check for index() *every* single time the heavily used [::] happened. Memory speed both. The O(1) __index__ check is probably rather inconsequential compared to the O(n) cost of actually performing the slicing. It's also a monkey patch, in that index() allows *conflicting* assumptions in violation of the unexpected monkey patch interaction worry. eg: Numpy *CAN* release an index() function on their floats -- at which point a basic no touch class (list itself) will now accept float as an index in direct contradiction of PEP 357's comment on floats... see? Such a change would only affect numpy floats, not all floats, so it would not be a monkey-patch. In any case, that would be incorrect usage of __index__. We're all consenting adults here; we don't need supervision to protect us from buggy third-party code. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
andrew...@gmail.com於 2012年10月29日星期一UTC+8上午11時12分11秒寫道: The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying. Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around. I *hate* replicating code every time I need to do this! I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't. Here's an example for Linux shell, otherwise remove /bin/env... {{{#!/bin/env python a=[1,2,3,4,5,6,7,8,9,10] print a[-4:3] # I am interested in getting [7,8,9,10,1,2] but I get []. }}} I'll suggest to use the reverse method to get what you want. Of course, the reverse method is not efficient for a list of a huge number of objects in python. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Thu, 01 Nov 2012 15:25:51 -0700, Andrew Robinson wrote: On 11/01/2012 12:07 PM, Ian Kelly wrote: Pep 357 merely added cruft with index(), but really solved nothing. Everything index() does could be implemented in __getitem__ and usually is. No. There is a significant difference between implementing this on the container versus implementing it on the indexes. Ethan implemented his string-based slicing on the container, because the behavior he wanted was specific to the container type, not the index type. Custom index types like numpy integers on the other hand implement __index__ on the index type, because they apply to all sequences, not specific containers. Hmmm... D'Aprano didn't like the monkey patch;and sub-classing was his fix-all. I pointed out that monkey-patching is a bad idea, even if it worked. But it doesn't work -- you simply cannot monkey-patch built-ins in Python. Regardless of whether I like the m-p or not, *you can't use it* because you patch built-in list methods. The best you could do is subclass list, then shadow the built-in name list with your subclass. But that gives all sorts of problems too, in some ways even worse than monkey-patching. You started this thread with a question about slicing. You believe that one particular use-case for slicing, which involves interpreting lists as circular rather than linear, is the use-case that built-in list slicing should have supported. Fine, you're entitled to your option. But that boat has sailed about 20 years ago. Python didn't make that choice, and it won't change now. If you write up a PEP, you could aim to have the built-in behaviour changed for Python 4 in perhaps another 10-15 years or so. But for the time being, that's not what lists, tuples, strings, etc. do. If you want that behaviour, if you want a circular list, then you have to implement it yourself, and the easiest way to do so is with a subclass. That's not a fix-all. I certainly don't believe that subclassing is the *only* way to fix this, nor that it will fix all things. But it might fix *some* things, such as you wanting a data type that is like a circular list rather than a linear list. If you prefer to create a circular-list class from scratch, re- implementing all the list-like behaviour, instead of inheriting from an existing class, then by all means go right ahead. If you have a good reason to spend days or weeks writing, testing, debugging and fine-tuning your new class, instead of about 15 minutes with a subclass, then I'm certainly not going to tell you not to. Part of my summary is based on that conversation with him,and you touched on one of the unfinished points; I responded to him that I thought __getitem__ was under-developed. The object slice() has no knowledge of the size of the sequence; nor can it get that size on it's own, but must passively wait for it to be given to it. That's because the slice object is independent of the sequence. As I demonstrated, you can pass a slice object to multiple sequences. This is a feature, not a bug. The bottom line is: __getitem__ must always *PASS* len( seq ) to slice() each *time* the slice() object is-used. The bottom line is: even if you are right, so what? The slice object doesn't know what the length of the sequence is. What makes you think that __getitem__ passes the length to slice()? Why would it need to recreate a slice object that already exists? It is the *sequence*, not the slice object, that is responsible for extracting the appropriate items when __getitem__ is called. __getitem__ gets a slice object as argument, it doesn't create one. It no more creates the slice object than mylist[5] creates the int 5. Since this is the case, But it isn't. it would have been better to have list, itself, have a default member which takes the raw slice indicies and does the conversion itself. The size would not need to be duplicated or passed -- memory savings, speed savings... We have already demonstrated that slice objects are smaller than (x)range objects and three-item tuples. In Python 3.3: py sys.getsizeof(range(1, 10, 2)) # xrange remained in Python 3 24 py sys.getsizeof((1, 10, 2)) 36 py sys.getsizeof(slice(1, 10, 2)) 20 It might help you to be taken seriously if you base your reasoning on Python as it actually is, rather than counter-factual assumptions. I'm just clay pidgeoning an idea out here Let's apply D'Aprano 's logic to numpy; Numpy could just have subclassed *list*; Sure they could have, if numpy arrays were intended to be a small variation on Python lists. But they weren't, so they didn't. so let's ignore pure python as a reason to do anything on the behalf on Numpy: Then, lets' consider all thrid party classes; These are where subclassing becomes a pain -- BUT: I think those could all have been injected. class ThirdParty( list ): # Pretend this is someone else's... ... def
Re: Negative array indicies and slice()
On 10/30/2012 10:29 PM, Michael Torrie wrote: As this is the case, why this long discussion? If you are arguing for a change in Python to make it compatible with what this fork you are going to create will do, this has already been fairly thoroughly addressed earl on, and reasons why the semantics will not change anytime soon have been given. I'm not arguing for a change in the present release of Python; and I have never done so. Historically, if a fork happens to produce something surprisingly _useful_; the main code bank eventually accepts it on their own. If a fork is a mistake, it dies on its own. That really is the way things ought to be done. include this The Zen of Python, by _Tim Peters_ Special cases aren't special enough to break the rules. Although _practicality beats purity_. Now, I have seen several coded projects where the idea of cyclic lists is PRACTICAL; and the idea of iterating slices may be practical if they could be made *FASTER*. These warrant looking into -- and carefully; and that means making an experimental fork; preferably before I attempt to micro-port the python. Regarding the continuing discussion: The more I learn, the more informed decisions I can make regarding implementation. I am almost fully understanding the questions I originally asked, now. What remains are mostly questions about compatibility wrappers, and how to allow them to be used -- or selectively deleted when not necessary; and perhaps a demonstration or two about how slices and named tuples can (or can't) perform nearly the same function in slice processing. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Tue, Oct 30, 2012 at 4:25 PM, Andrew Robinson andr...@r3dsolutions.com wrote: Ian, Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? I don't really disagree with him, anyway. It is a rather obscure bug -- is it worth increasing the memory footprint of slice objects by 80% in order to fix it? :D In either event, a *bug* does exist (at *least* 20% of the time.) Tim Peters could have opened the *appropriate* bug complaint if he rejected the inappropriate one. Where are you getting that 20% figure from? Reference cycles involving slice objects would be extremely rare, certainly far less than 20%. The API ought to have either 1) included the garbage collection, or 2) raised an exception anytime dangerous/leaky data was supplied to slice(). How would you propose detecting the latter? At the time data is supplied to slice() it cannot refer to the slice, as the slice does not exist yet. The cycle has to be created after. If it is worth getting rid of the 4 words of extra memory required for the GC -- on account of slice() refusing to support data with sub-objects; then I'd also point out that a very large percentage of the time, tuples also contain data (typically integers or floats,) which do not further sub-reference objects. Hence, it would be worth it there too. I disagree. The proportion of the time that a tuple contains other collection objects is *much* greater. This happens regularly. OTOH, if I had to hazard a guess at the frequency with which non-atomic objects are used in slices, it would be a fraction of a fraction of a fraction of a percent. I came across some unexpected behavior in Python 3.2 when experimenting with ranges and replacement Consider, xrange is missing, BUT: More accurately, range is gone, and xrange has been renamed range. a=range(1,5,2) a[1] 3 a[2] 5 a[1:2] range(3, 5, 2) Now, I wondered if it would still print the array or not; eg: if this was a __str__ issue vs. __repr__. print( a[1:2] ) # Boy, I have to get used to the print's parenthesis range(3, 5, 2) So, the answer is *NOPE*. I'm not sure why you would expect it to print a list here, without an explicit conversion. The result of calling range in Python 3 is a range object, not a list. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Tue, 30 Oct 2012 21:33:32 +, Mark Lawrence wrote: On 30/10/2012 18:02, Ian Kelly wrote: On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote: File a bug report? Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? :) I see your smiley, but for the benefit of those who actually don't know who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python history. He invented Python's astonishingly excellent sort routine, Timsort, and popularised the famous adverbial phrase signoffs you will see in a lot of older posts. Basically, he is in the pantheon of early Python demigods. stop-me-before-i-start-gushing-over-the-timbot-ly y'rs, -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 31/10/2012 10:07, Steven D'Aprano wrote: On Tue, 30 Oct 2012 21:33:32 +, Mark Lawrence wrote: Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? :) I see your smiley, but for the benefit of those who actually don't know who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python history. He invented Python's astonishingly excellent sort routine, Timsort, and popularised the famous adverbial phrase signoffs you will see in a lot of older posts. Basically, he is in the pantheon of early Python demigods. stop-me-before-i-start-gushing-over-the-timbot-ly y'rs, 4 / 10, must try harder, the omission of the Zen of Python is considered a very serious matter :) -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson andr...@r3dsolutions.comwrote: Then; I'd note: The non-goofy purpose of slice is to hold three data values; They are either numbers or None. These *normally* encountered values can't create a memory loop. So, FOR AS LONG, as the object representing slice does not contain an explicit GC pair; I move that we mandate (yes, in the current python implementation, even as a *fix*) that its named members may not be assigned any objects other than None or numbers eg: Lists would be forbidden Since functions, and subclasses, can be test evaluated by int( the_thing_to_try ) and *[] can too, generality need not be lost for generating nothing or numbers. PEP 357 requires that anything implementing the __index__ special method be allowed for slicing sequences (and also that __index__ be used for the conversion). For the most part, that includes ints and numpy integer types, but other code could be doing esoteric things with it. The change would be backward-incompatible in any case, since there is certainly code out there that uses non-numeric slices -- one example has already been given in this thread. And more wonderful yet, when I do extended slice replacement -- it gives me results beyond my wildest imaginings! a=[0,1,2,3,4,5] a[4:5]=range( 0, 3 ) # Size origin=1, Size dest =3 a [0, 1, 2, 3, 0, 1, 2, 5] # Insert on top of replacement But !!!NOT!!! if I do it this way: a[4]=range( 0, 3 ) a [0, 1, 2, 3, range(0, 3), 1, 2, 5] That's nothing to do with range or Python 3. It's part of the difference between slice assignment and index assignment. The former unpacks an iterable, and the latter assigns a single object. You'd get the same behavior with lists: a = list(range(6)) a[4:5] = list(range(3)) a [0, 1, 2, 3, 0, 1, 2, 5] a = list(range(6)) a[4] = list(range(3)) a [0, 1, 2, 3, [0, 1, 2], 5] Slice assignment unpacks the list; index assignment assigns the list itself at the index. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/31/2012 02:20 PM, Ian Kelly wrote: On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote: Then; I'd note: The non-goofy purpose of slice is to hold three data values; They are either numbers or None. These *normally* encountered values can't create a memory loop. So, FOR AS LONG, as the object representing slice does not contain an explicit GC pair; I move that we mandate (yes, in the current python implementation, even as a *fix*) that its named members may not be assigned any objects other than None or numbers eg: Lists would be forbidden Since functions, and subclasses, can be test evaluated by int( the_thing_to_try ) and *[] can too, generality need not be lost for generating nothing or numbers. PEP 357 requires that anything implementing the __index__ special method be allowed for slicing sequences (and also that __index__ be used for the conversion). For the most part, that includes ints and numpy integer types, but other code could be doing esoteric things with it. I missed something... (but then that's why we're still talking about it...) Reading the PEP, it notes that *only* integers (or longs) are permitted in slice syntax. (Overlooking None, of course... which is strange...) The PEP gives the only exceptions as objects with method __index__. Automatically, then, an empty list is forbidden (in slice syntax). However, What you did, was circumvent the PEP by passing an empty list directly to slice(), and avoiding running it through slice syntax processing. So... Is there documentation suggesting that a slice object is meant to be used to hold anything other than what comes from processing a valid slice syntax [::]??. (we know it can be done, but that's a different Q.) The change would be backward-incompatible in any case, since there is certainly code out there that uses non-numeric slices -- one example has already been given in this thread. Hmmm. Now, I'm thinking -- The purpose of index(), specifically, is to notify when something which is not an integer may be used as an index; You've helpfully noted that index() also *converts* those objects into numbers. Ethan Fullman mentioned that he used the names of fields, instead of having to remember the _offsets_; Which means that his values _do convert_ to offset numbers His example was actually given in slice syntax notation [::]. Hence, his objects must have an index() method, correct?. Therefore, I still see no reason why it is permissible to assign non-numerical (non None) items as an element of slice(). Or, let me re-word that more clearly -- I see no reason that slice named members when used as originally intended would ever need to be assigned a value which is not *already* converted to a number by index(). By definition, if it can't be coerced, it isn't a number. A side note: At 80% less overhead, and three slots -- slice is rather attractive to store RGB values in for a picture! But, I don't think anyone would have a problem saying No, we won't support that, even if you do do it! So, what's the psychology behind allowing slice() to hold objects which are not converted to ints/longs in the first place? -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 12:00 PM, Andrew Robinson andr...@r3dsolutions.com wrote: I downloaded the source code for python 3.3.0, as the tbz; In the directory Python-3.3.0/Python, look at Python-ast.c, line 2089 ff. Python-ast.c is part of the compiler code. That's not the struct used to represent the object at runtime, but the struct used to represent the AST node while compiling. For the runtime definition of slices, look at sliceobject.h and sliceobject.c. Slices are represented as: typedef struct { PyObject_HEAD PyObject *start, *stop, *step; /* not NULL */ } PySliceObject; PyObject_HEAD is a macro that incorporates the object type pointer and the reference count. Hence, 5 words. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
Hi Ian, There are several interesting/thoughtful things you have written. I like the way you consider a problem before knee jerk answering. The copying you mention (or realloc) doesn't re-copy the objects on the list. It merely re-copies the pointer list to those objects. So lets see what it would do... I have seen doubling as the supposed re-alloc method, but I'll assume 1.25 -- so, 1.25**x = 20million, is 76 copies (max). The final memory copy would leave about a 30MB hole. And my version of Python operates initially with a 7MB virtual footprint. Sooo If the garbage collection didn't operate at all, the copying would waste around: z,w = 30e6,0 while (z1): w,z = w+z, z/1.25 ... print(w) 14995.8589521 eg: 150MB cummulative. The doubles would amount to 320Megs max. Not enough to fill virtual memory up; nor even cause a swap on a 2GB memory machine. It can hold everything in memory at once. So, I don't think Python's memory management is the heart of the problem, although memory wise-- it does require copying around 50% of the data. As an implementation issue, though, the large linear array may cause wasteful caching/swapping loops, esp, on smaller machines. On 10/29/2012 10:27 AM, Ian Kelly wrote: Yes, I misconstrued your question. I thought you wanted to change the behavior of slicing to wrap around the end when start stop instead of returning an empty sequence. ... Chris has already given ... You could also use map for this: new_seq = list(map(old_seq.__getitem__, iterable)) MMM... interesting. I am not against changing the behavior, but I do want solutions like you are offering. As I am going to implement a python interpreter, in C, being able to do things differently could significantly reduce the interpreter's size. However, I want to break existing scripts very seldom... I'm aware of what is possible in C with pointer arithmetic. This is Python, though, and Python by design has neither pointers nor pointer arithmetic. In any case, initializing the pointer to the end of the array would still not do what you want, since the positive indices would then extend past the end of the array. Yes, *and* if you have done assembly language programming -- you know that testing for sign is a trivial operation. It doesn't even require a subtraction. Hence, at the most basic machine level -- changing the base pointer *once* during a slice operation is going to be far more efficient than performing multiple subtractions from the end of an array, as the Python API defines. I'll leave out further gory details... but it is a Python interpreter built in C issue. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson andr...@r3dsolutions.com wrote: In addition to those items you mention, of which the reference count is not even *inside* the struct -- there is additional debugging information not mentioned. Built in objects contain a line number, a column number, and a context pointer. These each require a full word of storage. Also, built in types appear to have a kind field which indicates the object type but is not a pointer. That suggests two object type indicators, a generic pointer (probably pointing to builtin? somewhere outside the struct) and a specific one (an enum) inside the C struct. Inside the tuple struct, I count 4 undocumented words of information. Over all, there is a length, the list of pointers, a kind, line, col and context; making 6 pieces in total. Although your comment says the head pointer is not required; I found in 3.3.0 that it is a true head pointer; The Tuple() function on line 2069 of Python-ast.c, (3.3 version) -- is passed in a pointer called *elts. That pointer is copied into the Tuple struct. As above, you're looking at the compiler code, which is why you're finding things like line and column. The tuple struct is defined in tupleobject.h and stores tuple elements in a tail array. How ironic, slices don't have debugging info, that's the main reason they are smaller. When I do slice(3,0,2), suprisingly Slice() is NOT called. But when I do a[1:2:3] it *IS* called. Because compiling the latter involves parsing slicing syntax, and compiling the former does not. :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 04:01 PM, Ian Kelly wrote: On Mon, Oct 29, 2012 at 9:20 AM, Andrew Robinson andr...@r3dsolutions.com wrote: FYI: I was asking for a reason why Python's present implementation is desirable... I wonder, for example: Given an arbitrary list: a=[1,2,3,4,5,6,7,8,9,10,11,12] Why would someone *want* to do: a[-7,10] Instead of saying a[5:10] or a[-7:-2] ? A quick search of local code turns up examples like this: if name.startswith('{') and name.endswith('}'): name = name[1:-1] Which is done to avoid explicitly calling the len() operator. If slices worked like ranges, then the result of that would be empty, which is obviously not desirable. Yes, and that's an excellent point -- but note what I am showing in the example. It is that example, which I am specifying. There are only two cases where I think the default behavior of Python gives undesirable results: The step is positive, and the pair of indexes goes from negative to positive. Likewise, If the pair went from positive to negative, and the step was negative. In all other combinations, the default behavior of python ought to remain intact. I apologize for not making this crystal clear -- I thought you would focus on the specific example I gave. I don't know of a reason why one might need to use a negative start with a positive stop, though. I've already given several examples; and another poster did too -- eg: Gene sequences for bacteria. It's not uncommon to need this. If I do some digging, I can also show some common graphics operations that benefit greatly from this ability -- NOTE: in another thread I just showed someone how to operate on RGBA values... Slicing becomes THE major operation done when converting, or blitting, graphics data. etc. Another example -- Jpeg, for example, uses discrete cosines -- which are a naturally cyclic data type. They repeat with a fixed period. I know there are C libraries already made for Jpeg -- but that doesn't mean many other applications with no C library aren't plagued by this problem. I don't know how to make this point more clear. There really *ARE* applications that uses cyclic lists of data; or which can avoid extra logic to fix problems encountered from linear arrays which *end* at a particular point. sometimes it is desirable for a truncation to occur, sometimes it's NOT. The sign convention test I outlined, I believe, clearly detects when a cyclic data set is desired. If there are normal examples where my tests fail -- that's what's important to me. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 10:53 PM, Michael Torrie wrote: On 10/29/2012 01:34 PM, Andrew Robinson wrote: No, I don't think it big and complicated. I do think it has timing implications which are undesirable because of how *much* slices are used. In an embedded target -- I have to optimize; and I will have to reject certain parts of Python to make it fit and run fast enough to be useful. Since you can't port the full Python system to your embedded machine anyway, why not just port a subset of python and modify it to suit your needs right there in the C code. It would be a fork, yes, You're exactly right; That's what I *know* I am faced with. Without a libc, an MMU on the CPU, and a kernel, it's not going to just compile and run. I have libc. The MMU is a problem; but the compiler implements the standard C math library; floats, though, instead of doubles. That's the only problem -- there. What you want with slicing behavior changes has no place in the normal cPython implementation, for a lot of reasons. The main one is that it is already possible to implement what you are talking about in your own python class, which is a fine solution for a normal computer with memory and CPU power available. If the tests I outlined in the previous post inaccurately describe a major performance improvement and at least a modest code size reduction; or will *often* introduce bugs -- I *AGREE* with you. Otherwise, I don't. I don't think wasting extra CPU power is a good thing -- Extra CPU power can always be used by something else I won't belabor the point further. I'd love to see a counter example to the specific criteria I just provided to IAN -- it would end my quest; and be a good reference to point others to. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 11:51 PM, Ian Kelly wrote: On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson As above, you're looking at the compiler code, which is why you're finding things like line and column. The tuple struct is defined in tupleobject.h and stores tuple elements in a tail array. If you re-check my post to chris, I listed the struct you mention. The C code is what is actually run (by GDB breakpoint test) when a tuple is instantiated. If the tuple were stripped of the extra data -- then it ought to be as small as slice(). But it's not as small -- so either the sys.getsizeof() is lying -- or the struct you mention is not complete. Which? --Andrew. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 7:49 PM, Chris Kaynor ckay...@zindagigames.com wrote: NOTE: The above is taken from reading the source code for Python 2.6. For some odd reason, I am getting that an empty tuple consists of 6 pointer-sized objects (48 bytes on x64), rather than the expected 3 pointer-sized (24 bytes on x64). Slices are showing up as the expected 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1 pointer (8 bytes on x64) per item. I imagine I am missing something, but cannot figure out what that would be. I'm likewise seeing 4 extra words in tuples in 32-bit Python 3.3. What I've found is that for tuples and other collection objects, the garbage collector tacks on an extra header before the object in memory. That header looks like this: typedef union _gc_head { struct { union _gc_head *gc_next; union _gc_head *gc_prev; Py_ssize_t gc_refs; } gc; long double dummy; /* force worst-case alignment */ } PyGC_Head; gc_next and gc_prev implement a doubly-linked list that the garbage collector uses to explicitly track this object. gc_refs is used for counting references during a garbage collection and stores the GC state of the object otherwise. I'm not entirely certain why collection objects get this special treatment, but there you have it. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 6:17 PM, Andrew Robinson andr...@r3dsolutions.com wrote: If you re-check my post to chris, I listed the struct you mention. The C code is what is actually run (by GDB breakpoint test) when a tuple is instantiated. When you were running GDB, were you debugging the interactive interpreter or a precompiled script? The interactive interpreter does a compilation step for every line entered. If the tuple were stripped of the extra data -- then it ought to be as small as slice(). But it's not as small -- so either the sys.getsizeof() is lying -- or the struct you mention is not complete. As just explained, the extra 16 bytes are added by the garbage collector. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 5:54 PM, Andrew Robinson andr...@r3dsolutions.com wrote: I don't know of a reason why one might need to use a negative start with a positive stop, though. I've already given several examples; and another poster did too I meant that I don't know of a reason to do that given the existing semantics, which is what you were asking for. :-) I understand and agree that there are potential applications for allowing slices to wrap around from negative to positive. What I'm not convinced of is that these applications need or should be handled by the slicing operation -- which is more commonly understood as producing subsequences -- especially since they already can be handled relatively easily by iteration. I think that more users would find the behavior surprising than useful. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
By the way Andrew, the timestamps on your emails appear to be off, or possibly the time zone. Your posts are allegedly arriving before the posts you reply to, at least according to my news client. On Mon, 29 Oct 2012 12:34:24 -0700, Andrew Robinson wrote: On 10/29/2012 05:02 PM, Steven D'Aprano wrote: On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote: But, why can't I just overload the existing __getitem__ for lists and not bother writing an entire class? You say that as if writing an entire class was a big complicated effort. It isn't. It is trivially simple, a single line: class MyList(list): ... No, I don't think it big and complicated. I do think it has timing implications which are undesirable because of how *much* slices are used. In an embedded target -- I have to optimize; and I will have to reject certain parts of Python to make it fit and run fast enough to be useful. Then I look forward to seeing your profiling results that show that the overhead of subclassing list is the bottleneck in your application. Until then, you are making the classic blunder of the premature optimizer: More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason — including blind stupidity. — W.A. Wulf I am not impressed by performance arguments when you have (apparently) neither identified the bottlenecks in your code, nor even measured the performance. You are essentially *guessing* where the bottlenecks are, and *hoping* that some suggested change will be an optimization rather than a pessimization. Of course I may be wrong, and you have profiled your code and determined that the overhead of inheritance is a problem. If so, that's a different ball game. But your posts so far suggest to me that you're trying to predict performance optimizations rather than measure them. You can just overload that one method in a subclass of list. Being able to monkey-patch __getitem__ for the list class itself would not be advisable, as it would affect all list slicing anywhere in your program and possibly lead to some unexpected behaviors. That's what I am curious about. What unexpected behaviors would a monkey patch typically cause? What part of unexpected is unclear? Ahh -- The I don't know approach! It's only unexpected if one is a bad programmer...! No, it is unexpected because once you start relying on monkey-patching, and the libraries you install are monkey-patching, you have a combinational explosion of interactions. Any piece of code, anywhere, could monkey-patch any other piece of code -- it is a form of action-at-a- distance coding, like GOTOs and global variables. Use it with caution, in moderation. Let me see if I can illustrate a flavour of the sort of things that can happen if monkey-patching built-ins were allowed. [...] Right, which means that people developing the libraries made contradictory assumptions. Not necessarily. Not only can monkey-patches conflict, but they can combine in bad ways. It isn't just that Fred assumes X and Barney assumes not-X, but also that Fred assumes X and Barney assumes Y and *nobody* imagined that there was some interaction between X and Y. [...] Ruby allows monkey-patching of everything. And the result was predictable: http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying- ruby/ I read that post carefully; and the author purposely notes that he is exaggerating. Not carefully enough. He notes that he was using a provocative title and that he doesn't actually think that Ruby is being destroyed. But the actual harm he describes is real, e.g. bugs that take months to track down. What you are talking about is namespace preservation; I haven't mentioned namespaces. Nothing I have said has anything to do with namespaces. I remember Apple monkey-patching routines in ROM back in the mid 1980s, long before there was anything corresponding to namespaces in Apple's programming model. and I am thinking about it. I can preserve it -- but only if I disallow true Python primitives in my own interpreter; I can't provide two sets in the memory footprint I am using. If you want to write a language that is not Python, go right ahead. If someone had a clear explanation of the disadvantages of allowing an iterator, or a tuple -- in place of a slice() -- I would have no qualms dropping the subject. However, I am not finding that yet. I am finding very small optimization issues... Python has certain public interfaces. If your language does not support those public interfaces, then it might be an awesome language, but it is not Python. Python slices have certain features: * they can be used repeatedly; * they have public attributes start, step and stop; * The stop attribute can be None, and the slice will default to the length of the thing being sliced, which is not known until call-time. * Slices have a
Re: Negative array indicies and slice()
On Tue, Oct 30, 2012 at 1:21 AM, Ian Kelly ian.g.ke...@gmail.com wrote: I'm not entirely certain why collection objects get this special treatment, but there you have it. Thinking about it some more, this makes sense. The GC header is there to support garbage collection for the object. Atomic types like ints do not need this header because they do not reference other objects and so cannot be involved in reference cycles. For those types, reference counting is sufficient. For types like collections that do reference other objects, garbage collection is needed. Expanding on this, I suspect it is actually a bug that slice objects are not tracked by the garbage collector. The following code appears to result in a memory leak: import gc gc.disable() while True: for i in range(100): l = [] s = slice(l) l.append(s) del s, l _ = gc.collect() Try running that and watch your Python memory usage climb and climb. For contrast, replace the slice with a list and observe that memory usage does *not* climb. On each iteration, the code constructs a reference cycle between a slice and a list. It seems that because slices are not tracked by the garbage collector, it is unable to break these cycles. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote: File a bug report? Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/30/2012 11:02 AM, Ian Kelly wrote: On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furmanet...@stoneleaf.us wrote: File a bug report? Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 Thanks, IAN, you've answered the first of my questions and have been a great help. (And yes, I was debugging interactive mode... I took a nap after writing that post, as I realized I had reached my 1 really bad post for the day... ) I at least I finally know why Python chooses to implement slice() as a separate object from tuple; even if I don't like the implications. I think there are three main consequences of the present implementation of slice(): 1) The interpreter code size is made larger with no substantial improvement in functionality, which increases debugging effort. 2) No protection against perverted and surprising (are you surprised?! I am) memory operation exists. 3) There is memory savings associated with not having garbage collection overhead. D'Apriano mentioned the named values, start, stop, step in a slice() which are an API and legacy issue; These three names must also be stored in the interpreter someplace. Since slice is defined at the C level as a struct, have you already found these names in the source code (hard-coded), or are they part of a .py file associated with the interface to the C code? -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
Andrew Robinson wrote: I can see that the slice() function can pass in arbitrary arguments. I'm not sure for lists, which is what the range is applied to, why an argument like a would be part of a slice. Well, in my dbf.py Record class you can use the names of fields as the slice arguments, instead of having to remember the offsets. record['full_name':'zip4'] returns a tuple (or a list, I don't remember) of about 13 fields -- this is especially useful as that block of fields might not be in the same place in each table. ~Ethan~ -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 30/10/2012 18:02, Ian Kelly wrote: On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote: File a bug report? Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? :) -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence breamore...@yahoo.co.uk wrote: On 30/10/2012 18:02, Ian Kelly wrote: On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote: File a bug report? Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? :) I don't really disagree with him, anyway. It is a rather obscure bug -- is it worth increasing the memory footprint of slice objects by 80% in order to fix it? -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Tue, Oct 30, 2012 at 8:21 AM, Andrew Robinson andr...@r3dsolutions.com wrote: D'Apriano mentioned the named values, start, stop, step in a slice() which are an API and legacy issue; These three names must also be stored in the interpreter someplace. Since slice is defined at the C level as a struct, have you already found these names in the source code (hard-coded), or are they part of a .py file associated with the interface to the C code? You mean the mapping of Python attribute names to C struct members? That's in sliceobject.c: static PyMemberDef slice_members[] = { {start, T_OBJECT, offsetof(PySliceObject, start), READONLY}, {stop, T_OBJECT, offsetof(PySliceObject, stop), READONLY}, {step, T_OBJECT, offsetof(PySliceObject, step), READONLY}, {0} }; -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Wed, Oct 31, 2012 at 8:47 AM, Ian Kelly ian.g.ke...@gmail.com wrote: On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence breamore...@yahoo.co.uk wrote: On 30/10/2012 18:02, Ian Kelly wrote: On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote: File a bug report? Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? :) I don't really disagree with him, anyway. It is a rather obscure bug -- is it worth increasing the memory footprint of slice objects by 80% in order to fix it? Bug report: If I take this gun, aim it at my foot, and pull the trigger, sometimes a hole appears in my foot. This is hardly normal use of slice objects. And the penalty isn't a serious one unless you're creating cycles repeatedly. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Tue, Oct 30, 2012 at 3:55 PM, Ian Kelly ian.g.ke...@gmail.com wrote: On Tue, Oct 30, 2012 at 8:21 AM, Andrew Robinson andr...@r3dsolutions.com wrote: D'Apriano mentioned the named values, start, stop, step in a slice() which are an API and legacy issue; These three names must also be stored in the interpreter someplace. Since slice is defined at the C level as a struct, have you already found these names in the source code (hard-coded), or are they part of a .py file associated with the interface to the C code? You mean the mapping of Python attribute names to C struct members? That's in sliceobject.c: static PyMemberDef slice_members[] = { {start, T_OBJECT, offsetof(PySliceObject, start), READONLY}, {stop, T_OBJECT, offsetof(PySliceObject, stop), READONLY}, {step, T_OBJECT, offsetof(PySliceObject, step), READONLY}, {0} }; Note that the slice API also includes the slice.indices method. They also implement rich comparisons, but this appears to be done by copying the data to tuples and comparing the tuples, which is actually a bit ironic considering this discussion. :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/30/2012 01:17 AM, Steven D'Aprano wrote: By the way Andrew, the timestamps on your emails appear to be off, or possibly the time zone. Your posts are allegedly arriving before the posts you reply to, at least according to my news client. :D -- yes, I know about that problem. Every time I reboot it shows up again... It's a distribution issue, my hardware clock is in local time -- but when the clock is read by different scripts in my distribution, some refuse to accept that the system clock is not UTC. I'll be upgrading in a few weeks -- so I'm just limping along until then. My apology. Then I look forward to seeing your profiling results that show that the overhead of subclassing list is the bottleneck in your application. Until then, you are making the classic blunder of the premature optimizer: More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason — including blind stupidity. — W.A. Wulf I'm sure that's true. Optimization, though, is a very general word. On a highway in my neighborhood -- the government keeps trying to put more safety restrictions on it, because it statistically registers as the highest accident rate road in the *entire* region. Naturally, the government assumes that people in my neighborhood are worse drivers than usual and need to be policed more -- but the truth is, that highway is the *ONLY* access road in the region for dozens of miles in any direction for a densely populated area, so if there is going to be an accident it will happen there; the extra safety precautions are not necessary when the accident rate is looked at from a per-capita perspective of those driving the highway. I haven't made *the* blunder of the premature optimizer because I haven't implemented anything yet. Premature optimizers don't bother to hold public conversation and take correction. OTOH: people who don't ever optimize out of fear, pay an increasing bloat price with time. I am not impressed by performance arguments when you have (apparently) neither identified the bottlenecks in your code, nor even measured the performance. Someone else already did a benchmark between a discrete loop and a slice operation. The difference in speed was an order of magnitude different. I bench-marked a map operation, which was *much* better -- but also still very slow in comparison. Let's not confound an issue here -- I am going to implement the python interpreter; and am not bound by optimization considerations of the present python interpreter -- There are things I can do which as a python programmer -- you can't. I have no choice but to re-implement and optimize the interpreter -- the question is merely how to go about it. You are essentially *guessing* where the bottlenecks are, and *hoping* that some suggested change will be an optimization rather than a pessimization. Of course I may be wrong, and you have profiled your code and determined that the overhead of inheritance is a problem. If so, that's a different ball game. But your posts so far suggest to me that you're trying to predict performance optimizations rather than measure them. Not really; Inheritance itself and it's timing aren't my main concern. Even if the time was *0* that wouldn't change my mind. There are man hours in debugging time caused by not being able to wrap around in a slice. (I am not ignoring the contrary man hours of an API change's bugs). Human psychology is important; and it's a double edged sword. I would refer you to a book written by Steve Maguire, Writing Solid Code; Chapter 5; Candy machine interfaces. He uses the C function realloc() as an excellent example of a bad API; but still comments on one need that it *does* fulfill -- I've found it better to have one function that both shrinks and expands blocks so that I don't have to write *ifs* constructs every time I need to resize memory. True, I give up some extra argument checking, but this is offset by the *ifs* that I no longer need to write (*and possibly mess up*). * Extra steps that a programmer must take to achieve a task are places where bugs get introduced. * API's which must be debugged to see what particular operation it is performing rather than knowing what that operation is from looking at the un-compiled code are places where bugs get introduced. These two points are not friendly with each other -- they are in fact, generally in conflict. Right, which means that people developing the libraries made contradictory assumptions. Not necessarily. Not only can monkey-patches conflict, but they can combine in bad ways. It isn't just that Fred assumes X and Barney assumes not-X, but also that Fred assumes X and Barney assumes Y and *nobody* imagined that there was some interaction between X and Y. They *STILL* made contradictory assumptions; each of them assumed the interaction mechanism would not be applied in a
Re: Negative array indicies and slice()
On 30/10/2012 21:47, Ian Kelly wrote: On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence breamore...@yahoo.co.uk wrote: On 30/10/2012 18:02, Ian Kelly wrote: On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman et...@stoneleaf.us wrote: File a bug report? Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? :) I don't really disagree with him, anyway. It is a rather obscure bug -- is it worth increasing the memory footprint of slice objects by 80% in order to fix it? Thinking about it I entirely agree. An 80% increase in memory foorprint where the slice objects are being used with Python 3.3.0 Unicode would have disastrous consequences given the dire state of said Unicode, which is why some regular contributors here are giving up with Python and using Go. Oh gosh look at the time, I'm just going for a walk so I can talk with the Pixies at the bottom of my garden before they go night nights. -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 30/10/2012 15:47, Andrew Robinson wrote: I would refer you to a book written by Steve Maguire, Writing Solid Code; Chapter 5; Candy machine interfaces. The book that took a right hammering here http://accu.org/index.php?module=bookreviewsfunc=searchrid=467 ? -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/30/2012 04:48 PM, Mark Lawrence wrote: On 30/10/2012 15:47, Andrew Robinson wrote: I would refer you to a book written by Steve Maguire, Writing Solid Code; Chapter 5; Candy machine interfaces. The book that took a right hammering here http://accu.org/index.php?module=bookreviewsfunc=searchrid=467 ? Yes, although Chapter 5 is the one where the realloc() issue is discussed. If you have a library, see if you can check the book out -- rather than spend $$$ on it. But, in good humor -- Consider the only criticism the poster mentioned about chapter 5's contents. Occasionally, he presents a code fragment with a subtle bug, such as: p = realloc(p,n); _I have to admit that I didn't spot the bug_, but then I never use realloc, knowing it to have pitfalls. What is the bug? If realloc cannot allocate the memory, it returns NULL and the assignment means you lose your original pointer. What are the pitfalls? Realloc may or may not copy the data to a new, larger, area of memory and return the address of that: many programmers forget this and end up with pointers into the old, deallocated, area. Even those programmers who remember will likely fall into the trap that Maguire shows. _Back to 'clever' code though, he prefers_: His critique is a bit like the scene in Monty Python's the Life of Br..an... Where the aliens come, and crash, and leave -- and is totally irrelevant to what the plot-line is in the movie. What does this comment have to do with a critique??? McGuire didn't fail to notice the bug! But the critic doesn't even notice the main *pointS* the author was trying to make in that chapter. There are, to be sure, recommendations that I don't agree with in the book; He doesn't seem to do much Unit testing, postmortems, etc. are all topics that I studied in a formal class on Software Engineering. It was a wider perspective than McGuire brings to his book; But that doesn't mean McGuire has nothing valuable to say! A short Python Homage for readers of the linked Critique! : I've had to follow GPL project style rules where the rule for a weird situation would be: while (*condition) /* nothing */ ; // and yes, this will sometimes generate a warning... But, I have enough brains to take McGuire's *suggestion* to an improved Python conclusion. #define PASS(x) {(void)NULL;} while (*condition) PASS( Gas ); // There will be no warning -- http://mail.python.org/mailman/listinfo/python-list
RE: Negative array indicies and slice()
Ian, Looks like it's already been wontfixed back in 2006: http://bugs.python.org/issue1501180 Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? I don't really disagree with him, anyway. It is a rather obscure bug -- is it worth increasing the memory footprint of slice objects by 80% in order to fix it? :D In either event, a *bug* does exist (at *least* 20% of the time.) Tim Peters could have opened the *appropriate* bug complaint if he rejected the inappropriate one. The API ought to have either 1) included the garbage collection, or 2) raised an exception anytime dangerous/leaky data was supplied to slice(). If it is worth getting rid of the 4 words of extra memory required for the GC -- on account of slice() refusing to support data with sub-objects; then I'd also point out that a very large percentage of the time, tuples also contain data (typically integers or floats,) which do not further sub-reference objects. Hence, it would be worth it there too. OTOH, if the GC is considered acceptable in non-sub-referenced tuples, GC ought to be acceptable in slice() as well. Inconsistency is the mother of surprises; and code bloat through exceptions Note that the slice API also includes the slice.indices method. They also implement rich comparisons, but this appears to be done by copying the data to tuples and comparing the tuples, which is actually a bit ironic considering this discussion. Yes, indeed! I didn't mention the slice.indicies method -- as it's purpose is traditionally to *directly* feed the parameters of xrange or range. ( I thought that might start a WAR! ). :D http://docs.python.org/release/2.3.5/whatsnew/section-slices.html class FakeSeq: ... return FakeSeq([self.calc_item(i) for i in_range(*indices)_]) else: return self.calc_item(i) And here I'm wondering why we can't just pass range into it directly... :( I came across some unexpected behavior in Python 3.2 when experimenting with ranges and replacement Consider, xrange is missing, BUT: a=range(1,5,2) a[1] 3 a[2] 5 a[1:2] range(3, 5, 2) Now, I wondered if it would still print the array or not; eg: if this was a __str__ issue vs. __repr__. print( a[1:2] ) # Boy, I have to get used to the print's parenthesis range(3, 5, 2) So, the answer is *NOPE*. I guess I need to read the doc's all over again... it's ... well, quite different. --Andrew. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/30/2012 09:47 AM, Andrew Robinson wrote: Let's not confound an issue here -- I am going to implement the python interpreter; and am not bound by optimization considerations of the present python interpreter -- There are things I can do which as a python programmer -- you can't. I have no choice but to re-implement and optimize the interpreter -- the question is merely how to go about it. As this is the case, why this long discussion? If you are arguing for a change in Python to make it compatible with what this fork you are going to create will do, this has already been fairly thoroughly addressed earl on, and reasons why the semantics will not change anytime soon have been given. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Sunday, October 28, 2012 9:26:01 PM UTC-7, Ian wrote: On Sun, Oct 28, 2012 at 10:00 PM, Andrew wrote: Hi Ian, Well, no it really isn't equivalent. Consider a programmer who writes: xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2] That is the idea of a range; for what reason would anyone *EVER* want -4 to +3 to be 6:3??? That is what ranges do, but your question was about slices, not ranges. Actually, I said in the OP: I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't. = Thank you for the code snippet; I don't think it likely that existing programs depend on nor use a negative index and a positive index expecting to take a small chunk in the center... hence, I would return the whole array; Or if someone said [-len(listX) : len(listX)+1 ] I would return the whole array twice. That's the maximum that is possible. If someone could show me a normal/reasonable script which *would* expect the other behavior, I'd like to know; compatibility is important. = My intended inferences about the iterator vs. slice question was perhaps not obvious to you; Notice: an iterator is not *allowed* in __getitem__(). The slice class when passed to __getitem__() was created to merely pass two numbers and a stride to __getitem__; As far as I know slice() itself does *nothing* in the actual processing of the elements. So, it's *redundant* functionality, and far worse, it's restrictive. The philosophy of Python is to have exactly one way to do something when possible; so, why create a stand alone class that does nothing an existing class could already do, and do it better ? A simple list of three values would be just as efficient as slice()! xrange is more flexible, and can be just as efficient. So, Have I misunderstood the operation of slice()? I think I might have... but I don't know. In 'C', where Python is written, circularly linked lists -- and arrays are both very efficient ways of accessing data. Arrays can, in fact, have negative indexes -- perhaps contrary to what you thought. One merely defines a variable to act as the base pointer to the array and initialize it to the *end* of the array. Nor is the size of the data elements an issue, since in Python all classes are accessed by pointers which are of uniform size. I routinely do this in C. Consider, also, that xrange() does not actually create a list -- but merely an iterator generating integers which is exactly what __getitem__ works on. So, xrange() does not need to incur a memory or noticeable time penalty. From micro-python, it's clear that their implementation of xrange() is at the 'C' level; which is extremely fast. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Sunday, October 28, 2012 10:14:03 PM UTC-7, Paul Rubin wrote: Andrew writes: So: Why does python choose to convert them to positive indexes, and have slice operate differently than xrange There was a thread a few years back, I think started by Bryan Olson, that made the case that slice indexing is a Python wart for further reasons than the above, and suggesting a notation like x[$-5] to denote what we now call x[-5] (i.e. $ is the length of the string). So your example x[$-4:3] would clearly be the same thing as x[6:3] and not give any suggestion that it might wrap around. I'm getting very frustrated with the editor provided for this group... It keeps posting prematurely, and putting my email in even when I tell it not to each time; and there is no way to edit a post... but deleting is ok... I think Olson makes a good point. The len() operator is so ubiquitous that it would be very useful to have a shorthand like that. I'll have to look for his thread. I'm thinking that I might just patch my version of Python 3.x, in C, to allow iterators to be passed to __getitem__; I haven't ever seen someone wanting to use mixed sign indexes to extract a small chunk of an array in the middle; so I don't think my patch will break existing code. The snippets of code given by other posters in the thread might also be used to make a compatibility wrapper; I'll have to study it closer; so that distributed code would still work on unpatched python, albeit much slower. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 12:54 AM, Andrew andrewr3m...@gmail.com wrote: On Sunday, October 28, 2012 9:26:01 PM UTC-7, Ian wrote: On Sun, Oct 28, 2012 at 10:00 PM, Andrew wrote: snip The slice class when passed to __getitem__() was created to merely pass two numbers and a stride to __getitem__; As far as I know slice() itself does *nothing* in the actual processing of the elements. So, it's *redundant* functionality, and far worse, it's restrictive. The philosophy of Python is to have exactly one way to do something when possible; so, why create a stand alone class that does nothing an existing class could already do, and do it better ? A simple list of three values would be just as efficient as slice()! xrange is more flexible, and can be just as efficient. So, Have I misunderstood the operation of slice()? I think I might have... but I don't know. `slice` is intentionally lenient about the types of the start, stop, and step: class Foo: ... def __getitem__(self, slice_): ... print(slice_) ... return 42 ... Foo()[a:b:c] slice('a', 'b', 'c') 42 Thus, the thing being sliced is free to interpret the parts of the slice however it wishes; hence, slice() is unable to contain the processing you speak of. By contrast, xrange() limits itself to integers. To support the more general case, the slice syntax thus produces a `slice` rather than an `xrange`. Doubtlessly, there are also historical issues involved. As implied by the ugliness of its name, `xrange` was added to the language relatively later. Cheers, Chris -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Sunday, October 28, 2012 9:44:56 PM UTC-7, alex23 wrote: On Oct 29, 2:09 pm, Andrew andrewr3m...@gmail.com wrote: I use this arbitrary range code *often* so I need a general purpose solution. I looked up slice() but the help is of no use, I don't even know how I might overload it to embed some logic to concatenate ranges of data; nor even if it is possible. Slices are passed in if provided to __getitem__/__setitem__/ __delitem__, so you'd need to override it at the list level: class RangedSlicer(list): def __getitem__(self, item): # map item.start, .stop and .step to your own semantics Then wrap your lists with your RangedSlicer class as needed. Hmmm... I began a test in an interactive shell: class RangedSlicer(list): ... def __getitem__(self,item): ... print item ... a=[1,2,3,4,5] a.__getitem__( slice(1,5) ) [2, 3, 4, 5] Very odd... I would have expected [1,2,3,4] a.__getitem__( slice(1,8) ) [2, 3, 4, 5] So, slice() somehow was truncated although it ought to have been executed first, and passed to __getitem__() before __getitem__ could affect it. That requires some tricky programming! Not only that, but, a.__getitem__( xrange[1,8] ) Causes an exception before the __getitem__ shadowing received it. I don't see how I can over-ride it with your suggestion, but that's very inconsistent for your idea seems to be normal python that would work for user defined classes. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 1:08 AM, andrewr3m...@gmail.com wrote: On Sunday, October 28, 2012 10:14:03 PM UTC-7, Paul Rubin wrote: Andrew writes: snip I'm getting very frustrated with the editor provided for this group... It keeps posting prematurely, and putting my email in even when I tell it not to each time; and there is no way to edit a post... but deleting is ok... This is a Usenet newsgroup[1], not a web forum. There are noteworthy differences between the two. FWICT, you happen to be accessing us via Google Groups, which is widely acknowledged to suck. We are not hosted *by* Google Groups; they just happen to carry our posts. Personally, I'd suggest using our mailing list mirror instead: http://mail.python.org/mailman/listinfo/python-list Or use some other, better newsgroup provider that also carries us. [1]: http://en.wikipedia.org/wiki/Usenet Regards, Chris -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 1:24 AM, andrewr3m...@gmail.com wrote: On Sunday, October 28, 2012 9:44:56 PM UTC-7, alex23 wrote: On Oct 29, 2:09 pm, Andrew andrewr3m...@gmail.com wrote: snip class RangedSlicer(list): snip Then wrap your lists with your RangedSlicer class as needed. Hmmm... I began a test in an interactive shell: class RangedSlicer(list): ... def __getitem__(self,item): ... print item … This just defines a class; it doesn't modify in-place the normal behavior of plain lists. You have to actually *use* the class. a=[1,2,3,4,5] You never wrapped `a` in a RangedSlicer or otherwise made use of RangedSlicer! You wanted: a = RangedSlicer([1,2,3,4,5]) a.__getitem__( slice(1,5) ) [2, 3, 4, 5] Very odd... I would have expected [1,2,3,4] [2, 3, 4, 5] is the return value from `a.__getitem__( slice(1,5) )` (or, equivalently, from `[1,2,3,4,5][1:5]`). It is not the result of print item; that line of code is never executed since you never used the RangedSlicer class at all. Regards, Chris -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Monday, October 29, 2012 1:38:04 AM UTC-7, Chris Rebert wrote: On Mon, Oct 29, 2012 at 1:24 AM, On Sunday, October 28, 2012 9:44:56 PM UTC-7, alex23 wrote: On Oct 29, 2:09 pm, Andrew wrote: You never wrapped `a` in a RangedSlicer or otherwise made use of RangedSlicer! You wanted: a = RangedSlicer([1,2,3,4,5]) a.__getitem__( slice(1,5) ) [2, 3, 4, 5] Very odd... I would have expected [1,2,3,4] [2, 3, 4, 5] is the return value from `a.__getitem__( slice(1,5) )` (or, equivalently, from `[1,2,3,4,5][1:5]`). It is not the result of print item; that line of code is never executed since you never used the RangedSlicer class at all. Regards, Chris My apology --- I deleted that post; yet it didn't delete... I saw my mistake seconds after posting. * gmail. Note: I subscribed to the python-list, and am able to recieve e-mails, but I don't see how to write a post for this particular thread nor subscribe to this particular thread... A brief suggestion, or link to a howto would be *much* appreciated. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 29/10/2012 08:59, andrewr3m...@gmail.com wrote: Note: I subscribed to the python-list, and am able to recieve e-mails, but I don't see how to write a post for this particular thread nor subscribe to this particular thread... A brief suggestion, or link to a howto would be *much* appreciated. Get yourself a decent email client. I read all the Python lists that I'm interested in using Thunderbird on Windows via gmane. -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
Ok, hopefully this is better. I love my own e-mail editor... I can see that the slice() function can pass in arbitrary arguments. I'm not sure for lists, which is what the range is applied to, why an argument like a would be part of a slice. I *really* don't see what the advantage of a slice class is over a mere list in the order of start, stop, step eg: [ 1,4,9 ] In a dictionary, where a could be a key -- I wasn't aware that there was a defined order that the idea of slice could apply to. When I look at the documentation, http://www.python.org/doc//current/c-api/slice The only thing that slice has which is special, is that the the length of the sequence can be given -- and the start and stop index are either trimmed or an error (exception???) is thrown. Where is the information on the more general case of slice()? :-\ I am thinking, can one use the 'super' type of access, to override -- within the list object itself -- the __getitem__ method, and after pre-processing -- call the shadowed method with the modified parameters? That would allow me to use the normal a[-4:6] notation, without having to write a wrapper class that must be explicitly called. I'm thinking something like, PyListObject.__getitem__= lambda self, slice: --Andrew. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 29/10/2012 02:31, Andrew Robinson wrote: Ok, hopefully this is better. I love my own e-mail editor... I can see that the slice() function can pass in arbitrary arguments. I'm not sure for lists, which is what the range is applied to, why an argument like a would be part of a slice. I *really* don't see what the advantage of a slice class is over a mere list in the order of start, stop, step eg: [ 1,4,9 ] In a dictionary, where a could be a key -- I wasn't aware that there was a defined order that the idea of slice could apply to. When I look at the documentation, http://www.python.org/doc//current/c-api/slice The only thing that slice has which is special, is that the the length of the sequence can be given -- and the start and stop index are either trimmed or an error (exception???) is thrown. Where is the information on the more general case of slice()? :-\ I am thinking, can one use the 'super' type of access, to override -- within the list object itself -- the __getitem__ method, and after pre-processing -- call the shadowed method with the modified parameters? That would allow me to use the normal a[-4:6] notation, without having to write a wrapper class that must be explicitly called. I'm thinking something like, PyListObject.__getitem__= lambda self, slice: --Andrew. I suggest that you go back and read the tutorial about slicing. I say this because we've started with negative array indicies and slice() (but Python arrays haven't been mentioned :), then moved onto (x)range and now lists, dictionaries and the C API for slices. An alternative is to tell us precisely what you're trying to achieve. The odds are that there's a simple answer waiting in the wings for a simple question. -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, 29 Oct 2012 01:59:06 -0700, andrewr3mail wrote: Note: I subscribed to the python-list, and am able to recieve e-mails, but I don't see how to write a post for this particular thread nor subscribe to this particular thread... The beauty of email is that you don't have to subscribe to a thread. Once you subscribe to the mailing list, email is delivered into your inbox. To reply to it, just reply to it. To ignore it, throw it in the trash. Gmail should have a button or three that say Reply to email or similar. You want the button that says Reply to All or Reply to List. Make sure that the reply includes python-list@python.org as a recipient. Delete bits of the quoted email (the lines that start with characters) that are no longer relevant to the conversation. Type your reply. Double check that the reply is going to python-list. Then hit Send. (P.S. when you signed up for python-list@python.org, if you selected the option to receive a single daily digest instead of individual emails, you're going to have a bad time. Do yourself a favour -- and the rest of us -- and change back to individual emails.) -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, 29 Oct 2012 00:54:29 -0700, Andrew wrote: Actually, I said in the OP: I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't. Slices and iterators have different purposes and therefore have not been made interchangeable. Yes, there are certain similarities between a slice and xrange, but there are also significant differences. Thank you for the code snippet; I don't think it likely that existing programs depend on nor use a negative index and a positive index expecting to take a small chunk in the center... On the contrary. That is the most straightforward and useful idea of slicing, to grab a contiguous slice of items. Why would you want to grab a slice from the end of the list, and a slice from the start of the list, and swap them around? Apart from simulating card shuffles and cuts, who does that? hence, I would return the whole array; Or if someone said [-len(listX) : len(listX)+1 ] I would return the whole array twice. Well, that's one possible interpretation, but it is not the one that Python uses. When you create your own language, you can choose whatever interpretation seems most useful to you too. That's the maximum that is possible. If someone could show me a normal/reasonable script which *would* expect the other behavior, I'd like to know; compatibility is important. I'm not entirely sure I understand what you are asking here. My intended inferences about the iterator vs. slice question was perhaps not obvious to you; Notice: an iterator is not *allowed* in __getitem__(). Actually, you can write __getitem__ for your own classes to accept anything you like. py class Test: ... def __getitem__(self, index): ... return index ... py t = Test() py t[Hello world] 'Hello world' py t[{'x': None}] {'x': None} The slice class when passed to __getitem__() was created to merely pass two numbers and a stride to __getitem__; As far as I know slice() itself does *nothing* in the actual processing of the elements. So, it's *redundant* functionality, and far worse, it's restrictive. You say that as if it were a bad thing. The philosophy of Python is to have exactly one way to do something when possible; so, why create a stand alone class that does nothing an existing class could already do, and do it better ? What existing class is that? It certainly isn't xrange. Because xrange represents a concrete sequence of numbers, all three of start, end and stride must be concrete, known, integers: py xrange(4, None, 2) Traceback (most recent call last): File stdin, line 1, in module TypeError: an integer is required Whereas slices can trivially include blanks that get filled in only when actually used: py hello world[aslice] 'owrd' py NOBODY expects the Spanish Inquisition![aslice] 'D xet h pns nusto!' So, no, xrange is no substitute for slices. Not even close. A simple list of three values would be just as efficient as slice()! On the contrary, a simple list of three values not only could not do everything a slice does, but it's over twice the size! py sys.getsizeof([1, 2, 3]) 44 py sys.getsizeof(slice(1, 2, 3)) 20 xrange is more flexible, and can be just as efficient. Less flexible, less efficient. [snip] In 'C', where Python is written, That's a popular misapprehension. Python is written in Java, or Lisp, or Haskell, or CLR (dot Net), or RPython, or Ocaml, or Parrot. Each of those languages have, or had, at least one Python implementation. Oh, there's also a version written in C, or so I have heard. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 10:19 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: In 'C', where Python is written, That's a popular misapprehension. Python is written in Java, or Lisp, or Haskell, or CLR (dot Net), or RPython, or Ocaml, or Parrot. Each of those languages have, or had, at least one Python implementation. Oh, there's also a version written in C, or so I have heard. And that's not including the human-brain implementation, perhaps the most important of all. Although the current port of Python to my brain isn't quite a complete implementation, lacking a few bits that I should probably get to at some point, but even so, it's as useful to me as firing up IDLE. I wonder if what the OP is looking for is not slicing, but something more akin to map. Start with a large object and an iterator that produces keys, and create an iterator/list of their corresponding values. Something like: a=[1,2,3,4,5,6,7,8,9,10] b=[a[i] for i in xrange(-4,3)] It's not strictly a slice operation, but it's a similar sort of thing, and it can do the wraparound quite happily. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 04:32 AM, Chris Angelico wrote: I wonder if what the OP is looking for is not slicing, but something more akin to map. Start with a large object and an iterator that produces keys, and create an iterator/list of their corresponding values. Something like: a=[1,2,3,4,5,6,7,8,9,10] b=[a[i] for i in xrange(-4,3)] It's not strictly a slice operation, but it's a similar sort of thing, and it can do the wraparound quite happily. ChrisA A list comprehension ? That does do what I am interested in, *very* much so. Quite a gem, Chris! :-\ I am curious as to how quickly it constructs the result compared to a slice operation. Eg: a[1:5] vs. [ a[i] for i in xrange[1:5] ] But, unless it were grossly slower -- so that if/then logic and slices were generally faster -- I will use it. Thanks. --Andrew. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 3:52 PM, Andrew Robinson andr...@r3dsolutions.com wrote: I am curious as to how quickly it constructs the result compared to a slice operation. Eg: a[1:5] vs. [ a[i] for i in xrange[1:5] ] For the most part, don't concern yourself with performance. Go with functionality and readability. In the trivial case shown here, the slice is WAY clearer, so it should definitely be the one used; in other cases, the slice might simply be insufficient, so you go with whatever achieves your goal. Performance will usually be good enough, even if there's a marginally faster way. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 04:19 AM, Steven D'Aprano wrote: On Mon, 29 Oct 2012 00:54:29 -0700, Andrew wrote: Slices and iterators have different purposes and therefore have not been made interchangeable. Yes, there are certain similarities between a slice and xrange, but there are also significant differences. Aha, now were getting to the actual subject. [snip] In 'C', where Python is written, That's a popular misapprehension. Python is written in Java, or Lisp, or Haskell, or CLR (dot Net), or RPython, or Ocaml, or Parrot. Each of those languages have, or had, at least one Python implementation. Oh, there's also a version written in C, or so I have heard. :-P I didn't say it was only written in C, but in C where it is implemented. I will be porting Python 3.xx to a super low power embedded processor (MSP430), both space and speed are at a premium. Running Python on top of Java would be a *SERIOUS* mistake. .NET won't even run on this system. etc. Thank you for the code snippet; I don't think it likely that existing programs depend on nor use a negative index and a positive index expecting to take a small chunk in the center... On the contrary. That is the most straightforward and useful idea of slicing, to grab a contiguous slice of items. Show me an example where someone would write a slice with a negative and a positive index (both in the same slice); and have that slice grab a contiguous slice in the *middle* of the list with orientation of lower index to greater index. I have asked before; It's not that I don't think it possible -- it's that I can't imagine a common situation. Why would you want to grab a slice from the end of the list, and a slice from the start of the list, and swap them around? Apart from simulating card shuffles and cuts, who does that? Advanced statistics programmers using lookup tables that are symmetrical. Try Physicists too -- but they're notably weird. My intended inferences about the iterator vs. slice question was perhaps not obvious to you; Notice: an iterator is not *allowed* in __getitem__(). Actually, you can write __getitem__ for your own classes to accept anything you like. Yes, I realize that. But, why can't I just overload the existing __getitem__ for lists and not bother writing an entire class? Everything in Python is supposed to be an object and one of the big supposed selling points is the ability to overload any object's methods. The lists aren't special -- they're just a bunch of constant decimal numbers, typically given as a large tuple. py class Test: ... def __getitem__(self, index): ... return index ... Better: class Test: ... def __getitem__( self, *index ): ... return index No extra curlies required... You say that as if it were a bad thing. hmmm... and you as if sarcastic? :-) It is a bad thing to have any strictly un-necessary and non-code saving objects where memory is restricted. What existing class is that? It certainly isn't xrange. Because xrange represents a concrete sequence of numbers, all three of start, end and stride must be concrete, known, integers: Let's up the ante. I'll admit xrange() won't do later fill in the blank -- BUT -- xrange() is a subclass of an *existing* class called iterator. Iterators are very general. They can even be made random. The philosophy of Python is to have exactly one way to do something when possible; so, why create a stand alone class that does nothing an existing class could already do, and do it better ? py xrange(4, None, 2) Traceback (most recent call last): File stdin, line 1, inmodule TypeError: an integer is required Hmmm.. Let's try your example exactly as shown... hello world[aslice] Traceback (most recent call last): File stdin, line 1, in module NameError: name 'aslice' is not defined WOW. Cool. Where did the blanks *actually* get filled in? Or HOW WILL they in your next post? On the contrary, a simple list of three values not only could not do everything a slice does, but it's over twice the size! Yeah, There is a definite issue there. But the case isn't decided by that number alone. A slice is storing three integers -- and an integer is size is 12. So, slices don't use integers. If the type that *IS* used happens to be a real Python type, we may merely typecast integers to that type -- insert them in a tuple and by definition, they must be the same size. Looking at some of the online programming notes -- a slice apparently doesn't use an integer storage variable that is capable of arbitrary expansion. =-O -- and hence, won't work for very large sized lists. That actually explains some crashes I have noted in the past when working with 20 million element lists that I wanted a slice of. I had *plenty* of ram on that system. Besides: The program code to implement slice() is undoubtedly larger than 12 bytes of savings! How many slices() are typically found in memory simultaneously?
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 5:01 PM, Andrew Robinson andr...@r3dsolutions.com wrote: Looking at some of the online programming notes -- a slice apparently doesn't use an integer storage variable that is capable of arbitrary expansion. =-O -- and hence, won't work for very large sized lists. That actually explains some crashes I have noted in the past when working with 20 million element lists that I wanted a slice of. I had *plenty* of ram on that system. Can you provide links to these notes? I'm looking at cpython/Include/sliceobject.h that has this comment: /* A slice object containing start, stop, and step data members (the names are from range). After much talk with Guido, it was decided to let these be any arbitrary python type. Py_None stands for omitted values. */ Also, the code for slice objects in CPython works with Py_ssize_t (a signed quantity of the same length as size_t), which will allow at least 2**31 for an index. I would guess that your crashes were nothing to do with 20 million elements and slices. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
In article mailman.3009.1351516065.27098.python-l...@python.org, Andrew Robinson andr...@r3dsolutions.com wrote: Show me an example where someone would write a slice with a negative and a positive index (both in the same slice); and have that slice grab a contiguous slice in the *middle* of the list with orientation of lower index to greater index. It's possible in bioinformatics. Many organisms have circular chromosomes. It's a single DNA molecule spliced into a loop. There's an origin, but it's more a convenience thing for people to assign some particular base-pair to be location 0. From the organism's point of view, the origin isn't anything special (but there *is* a fixed orientation). It's entirely plausible for somebody to want to extract the sub-sequence from 100 bp (base-pairs) before the origin to 100 bp after the origin. If you were storing the sequence in Python string (or list), the most convenient way to express this would be seq[-100:100]. Likewise, if you wanted the *other* fragment, you would write seq[100:-100]. There is a minor confounding factor here in that biologists number sequences starting with 1, not 0. At least that was the way when I was doing this stuff mumble years ago. I don't know what the current convention is. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Oct 29, 2012 7:10 AM, Andrew Robinson andr...@r3dsolutions.com wrote: I will be porting Python 3.xx to a super low power embedded processor (MSP430), both space and speed are at a premium. Running Python on top of Java would be a *SERIOUS* mistake. .NET won't even run on this system. etc. If that's the case, then running Python at all is probably a mistake. You know the interpreter alone has an overhead of nearly 6 MB? Yes, I realize that. But, why can't I just overload the existing __getitem__ for lists and not bother writing an entire class? You can just overload that one method in a subclass of list. Being able to monkey-patch __getitem__ for the list class itself would not be advisable, as it would affect all list slicing anywhere in your program and possibly lead to some unexpected behaviors. Hmmm.. Let's try your example exactly as shown... hello world[aslice] Traceback (most recent call last): File stdin, line 1, in module NameError: name 'aslice' is not defined WOW. Cool. Where did the blanks *actually* get filled in? Or HOW WILL they in your next post? It appears that Steven omitted the definition of aslice by mistake. It looks like it should have been: aslice = slice(4, None, 2) Looking at some of the online programming notes -- a slice apparently doesn't use an integer storage variable that is capable of arbitrary expansion. =-O -- and hence, won't work for very large sized lists. That actually explains some crashes I have noted in the past when working with 20 million element lists that I wanted a slice of. I had *plenty* of ram on that system. 20 million is nothing. On a 32-bit system, sys.maxsize == 2 ** 31 - 1. If the error you were seeing was MemoryError, then more likely you were running into dynamic allocation issues due to fragmentation of virtual memory. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 1:54 AM, Andrew andrewr3m...@gmail.com wrote: My intended inferences about the iterator vs. slice question was perhaps not obvious to you; Notice: an iterator is not *allowed* in __getitem__(). Yes, I misconstrued your question. I thought you wanted to change the behavior of slicing to wrap around the end when start stop instead of returning an empty sequence. What you actually want is a new sequence built from indexes supplied by an iterable. Chris has already given you a list comprehension solution to solve that. You could also use map for this: new_seq = list(map(old_seq.__getitem__, iterable)) Since you seem to be concerned about performance, I'm not sure in this case whether the map or the list comprehension will be faster. I'll leave you to test that on your intended hardware. In 'C', where Python is written, circularly linked lists -- and arrays are both very efficient ways of accessing data. Arrays can, in fact, have negative indexes -- perhaps contrary to what you thought. One merely defines a variable to act as the base pointer to the array and initialize it to the *end* of the array. Nor is the size of the data elements an issue, since in Python all classes are accessed by pointers which are of uniform size. I routinely do this in C. I'm aware of what is possible in C with pointer arithmetic. This is Python, though, and Python by design has neither pointers nor pointer arithmetic. In any case, initializing the pointer to the end of the array would still not do what you want, since the positive indices would then extend past the end of the array. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, 29 Oct 2012 23:40:53 +1100, Chris Angelico wrote: On Mon, Oct 29, 2012 at 3:52 PM, Andrew Robinson andr...@r3dsolutions.com wrote: I am curious as to how quickly it constructs the result compared to a slice operation. Eg: a[1:5] vs. [ a[i] for i in xrange[1:5] ] For the most part, don't concern yourself with performance. Go with functionality and readability. In the trivial case shown here, the slice is WAY clearer, so it should definitely be the one used; in other cases, the slice might simply be insufficient, so you go with whatever achieves your goal. Performance will usually be good enough, even if there's a marginally faster way. Slicing is about an order of magnitude faster: [steve@ando ~]$ python2.7 -m timeit -s x = range(100, 1000, 2) x [20:40] 100 loops, best of 3: 0.342 usec per loop [steve@ando ~]$ python2.7 -m timeit -s x = range(100, 1000, 2) [x[i] for i in xrange(20, 40)] 10 loops, best of 3: 3.43 usec per loop -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, 29 Oct 2012 11:19:38 +, Steven D'Aprano wrote: Because xrange represents a concrete sequence of numbers, all three of start, end and stride must be concrete, known, integers: py xrange(4, None, 2) Traceback (most recent call last): File stdin, line 1, in module TypeError: an integer is required Whereas slices can trivially include blanks that get filled in only when actually used: py hello world[aslice] 'owrd' py NOBODY expects the Spanish Inquisition![aslice] 'D xet h pns nusto!' /me facepalms/ Argggh, I forgot to copy-and-paste the critical line defining aslice: aslice = slice(4, None, 2) Sorry about that. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 06:52 AM, Roy Smith wrote: Show me an example where someone would write a slice with a negative and a positive index (both in the same slice); and have that slice grab a contiguous slice in the *middle* of the list with orientation of lower index to greater index. It's possible in bioinformatics. ... eq[100:-100]. I decided to go to bed... I was starting to write very badly worded responses. :) Thanks, Roy, what you have just shown is another example that agrees with what I am trying to do. FYI: I was asking for a reason why Python's present implementation is desirable... I wonder, for example: Given an arbitrary list: a=[1,2,3,4,5,6,7,8,9,10,11,12] Why would someone *want* to do: a[-7,10] Instead of saying a[5:10] or a[-7:-2] ? eg: What algorithm would naturally *desire* the default behavior of slicing when using *mixed* negative and positive indexes? In the case of a bacterial circular DNA/RNA ring, asking for codons[ -10: 10 ] would logically desire codon[-10:] + codon[:10] not an empty list, right? I think your example is a very reasonable thing the scientific community would want to do with Python. :) -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 10:09 AM, Ian Kelly wrote: On Oct 29, 2012 7:10 AM, Andrew Robinsonandr...@r3dsolutions.com wrote: I will be porting Python 3.xx to a super low power embedded processor (MSP430), both space and speed are at a premium. Running Python on top of Java would be a *SERIOUS* mistake. .NET won't even run on this system. etc. If that's the case, then running Python at all is probably a mistake. You know the interpreter alone has an overhead of nearly 6 MB? There's already a version of the python interpreter which fits in under 100K: http://code.google.com/p/python-on-a-chip/ It's not the 3.x series, though; and I don't want to redo this once 2.7 really does become obsolete. Yes, I realize that. But, why can't I just overload the existing __getitem__ for lists and not bother writing an entire class? You can just overload that one method in a subclass of list. Being able to monkey-patch __getitem__ for the list class itself would not be advisable, as it would affect all list slicing anywhere in your program and possibly lead to some unexpected behaviors. That's what I am curious about. What unexpected behaviors would a monkey patch typically cause? If no one really uses negative and positive indexes in the same slice operation, because there is no reason to do so... It will only break the occasional esoteric application. 20 million is nothing. On a 32-bit system, sys.maxsize == 2 ** 31 - 1. If the error you were seeing was MemoryError, then more likely you were running into dynamic allocation issues due to fragmentation of virtual memory. No, there was no error at all. Pthon just crashed exited; not even an exception that I can recall. It was if it exited normally! The list was generated in a single pass by many .append() 's, and then copied once -- the original was left in place; and then I attempted to slice it. I am able to routinely to 5 million length lists, copy, slice, cut, append, and delete from them without this ever happening. If fragmentation were the issue, I'd think the shorter lists would cause the problem after many manipulations... It may not be a bug in python itself, though, of course. There are libraries it uses which might have a bug. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Tue, Oct 30, 2012 at 2:42 AM, Andrew Robinson andr...@r3dsolutions.com wrote: No, there was no error at all. Pthon just crashed exited; not even an exception that I can recall. It was if it exited normally! Can you create a reproducible test case? There's usually a cause to these sorts of things. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 9:20 AM, Andrew Robinson andr...@r3dsolutions.com wrote: FYI: I was asking for a reason why Python's present implementation is desirable... I wonder, for example: Given an arbitrary list: a=[1,2,3,4,5,6,7,8,9,10,11,12] Why would someone *want* to do: a[-7,10] Instead of saying a[5:10] or a[-7:-2] ? A quick search of local code turns up examples like this: if name.startswith('{') and name.endswith('}'): name = name[1:-1] If slices worked like ranges, then the result of that would be empty, which is obviously not desirable. I don't know of a reason why one might need to use a negative start with a positive stop, though. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 9:42 AM, Andrew Robinson andr...@r3dsolutions.com wrote: The list was generated in a single pass by many .append() 's, and then copied once -- the original was left in place; and then I attempted to slice it. Note that if the list was generated by .appends, then it was copied more than once. Python reserves a specific amount of space for the list. When it grows past that, the list must be reallocated and copied. It grows the list exponentially in order to keep the amortized time complexity of append at O(1), but the point is that a list of 20 million items is going to be resized and copied several times before it is complete. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
In article mailman.3056.1351552107.27098.python-l...@python.org, Ian Kelly ian.g.ke...@gmail.com wrote: On Mon, Oct 29, 2012 at 9:42 AM, Andrew Robinson andr...@r3dsolutions.com wrote: The list was generated in a single pass by many .append() 's, and then copied once -- the original was left in place; and then I attempted to slice it. Note that if the list was generated by .appends, then it was copied more than once. Python reserves a specific amount of space for the list. When it grows past that, the list must be reallocated and copied. It grows the list exponentially in order to keep the amortized time complexity of append at O(1), but the point is that a list of 20 million items is going to be resized and copied several times before it is complete. I think you're missing the point of amortized constant time. Yes, the first item appended to the list will be copied lg(20,000,000) ~= 25 times, because the list will be resized that many times(*). But, on average (I'm not sure if average is strictly the correct word here), each item will be copied only once. Still, it always stuck me as odd that there's no preallocate() method. There are times when you really do know how many items you're going to add to the list, and doing a single allocation would be a win. And it doesn't cost anything to provide it. I suppose, however, if you're adding enough items that preallocating would really matter, then maybe you want an array instead. (*) I don't know the exact implementation; I'm assuming each resize is a factor of 2. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 5:24 PM, Roy Smith r...@panix.com wrote: I think you're missing the point of amortized constant time. Yes, the first item appended to the list will be copied lg(20,000,000) ~= 25 times, because the list will be resized that many times(*). But, on average (I'm not sure if average is strictly the correct word here), each item will be copied only once. Still, it always stuck me as odd that there's no preallocate() method. There are times when you really do know how many items you're going to add to the list, and doing a single allocation would be a win. And it doesn't cost anything to provide it. I suppose, however, if you're adding enough items that preallocating would really matter, then maybe you want an array instead. (*) I don't know the exact implementation; I'm assuming each resize is a factor of 2. The growth factor is approximately 1.125. Approximately because there is also a small constant term. The average number of copies per item converges on 8. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 29 October 2012 23:01, Ian Kelly ian.g.ke...@gmail.com wrote: On Mon, Oct 29, 2012 at 9:20 AM, Andrew Robinson andr...@r3dsolutions.com wrote: FYI: I was asking for a reason why Python's present implementation is desirable... I wonder, for example: Given an arbitrary list: a=[1,2,3,4,5,6,7,8,9,10,11,12] Why would someone *want* to do: a[-7,10] Instead of saying a[5:10] or a[-7:-2] ? A quick search of local code turns up examples like this: if name.startswith('{') and name.endswith('}'): name = name[1:-1] If slices worked like ranges, then the result of that would be empty, which is obviously not desirable. I don't know of a reason why one might need to use a negative start with a positive stop, though. It's useful when getting a reversed slice: a = [1,2,3,4,5,6,7,8,9,10] a[-3:3:-1] [8, 7, 6, 5] Oscar -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 5:43 PM, Ian Kelly ian.g.ke...@gmail.com wrote: The growth factor is approximately 1.125. Approximately because there is also a small constant term. The average number of copies per item converges on 8. Of course, that is the *maximum* number of copies. The actual number could be much less if realloc() performs well. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote: But, why can't I just overload the existing __getitem__ for lists and not bother writing an entire class? You say that as if writing an entire class was a big complicated effort. It isn't. It is trivially simple, a single line: class MyList(list): ... plus the __getitem__ definition, which you would have to write anyway. It is a trivial amount of extra effort. You can just overload that one method in a subclass of list. Being able to monkey-patch __getitem__ for the list class itself would not be advisable, as it would affect all list slicing anywhere in your program and possibly lead to some unexpected behaviors. That's what I am curious about. What unexpected behaviors would a monkey patch typically cause? What part of unexpected is unclear? Monkey-patching is poor programming technique because it leads to *unexpected* and *impossible to predict* interactions between *distant* parts of the code. It leads to impossible to predict differences between the source code on disk and the actual running code. It leads to impossible to predict differences between documented behaviour and actual behaviour. Let me see if I can illustrate a flavour of the sort of things that can happen if monkey-patching built-ins were allowed. You create a list and print it: # simulated output py x = [5, 2, 4, 1] py print(x) [1, 2, 4, 5] What? How did that happen? That's not the list you provided. The order has been lost. So you dig deep into your code, and you don't find anything. And you read the Python documentation for lists, and don't find anything. And you google the Internet, and don't find anything. And you ask for help, and everybody says you're crazy because when they duplicate your code they get the expected behaviour. And you report a bug in Python, and it gets closed as cannot replicate. Finally you search deep into the libraries used in your code, and *five days later* discover that your code uses library A which uses library B which uses library C which uses library D which installs a harmless monkey-patch to print, but only if library E is installed, and you just happen to have E installed even though your code never uses it, AND that monkey-patch clashes with a harmless monkey-patch to list.__getitem__ installed by library F. And even though each monkey-patch alone is harmless, the combination breaks your code's output. Python allows, but does not encourage, monkey-patching of code written in pure Python, because it sometimes can be useful. It flat out prohibits monkey-patching of builtins, because it is just too dangerous. Ruby allows monkey-patching of everything. And the result was predictable: http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/ -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
In article mailman.3057.1351554215.27098.python-l...@python.org, Ian Kelly ian.g.ke...@gmail.com wrote: On Mon, Oct 29, 2012 at 5:24 PM, Roy Smith r...@panix.com wrote: I think you're missing the point of amortized constant time. Yes, the first item appended to the list will be copied lg(20,000,000) ~= 25 times, because the list will be resized that many times(*). But, on average (I'm not sure if average is strictly the correct word here), each item will be copied only once. Still, it always stuck me as odd that there's no preallocate() method. There are times when you really do know how many items you're going to add to the list, and doing a single allocation would be a win. And it doesn't cost anything to provide it. I suppose, however, if you're adding enough items that preallocating would really matter, then maybe you want an array instead. (*) I don't know the exact implementation; I'm assuming each resize is a factor of 2. The growth factor is approximately 1.125. Approximately because there is also a small constant term. The average number of copies per item converges on 8. Wow, that's surprising. It also makes it that much more surprising that there's no way to pre-allocate. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 06:53 AM, Chris Angelico wrote: Can you provide links to these notes? I'm looking at cpython/Include/sliceobject.h that has this comment: /* A slice object containing start, stop, and step data members (the names are from range). After much talk with Guido, it was decided to let these be any arbitrary python type. Py_None stands for omitted values. */ Also, the code for slice objects in CPython works with Py_ssize_t (a signed quantity of the same length as size_t), which will allow at least 2**31 for an index. I would guess that your crashes were nothing to do with 20 million elements and slices. ChrisA Let's look at the source code rather than the web notes -- the source must be the true answer anyhow. I downloaded the source code for python 3.3.0, as the tbz; In the directory Python-3.3.0/Python, look at Python-ast.c, line 2089 ff. Clearly a slice is malloced for a slice_ty type. It has four elements: kind, lower, upper, and step. So, tracing it back to the struct definition... Include/Python-ast.h has typedef struct _slice *slice_ty; And, here's the answer!: enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3}; struct _slice { enum _slice_kind kind; union { struct { expr_ty lower; expr_ty upper; expr_ty step; } Slice; struct { asdl_seq *dims; } ExtSlice; struct { expr_ty value; } Index; } v; }; So, slice() does indeed have arbitrary python types included in it; contrary to what I read elsewhere. expr_ty is a pointer to an arbitrary expression, so the actual structure is 4 pointers, at 32 bits each = 16 bytes. The size of the structure itself, given in an earlier post, is 20 bytes -- which means one more pointer is involved, perhaps the one pointing to the slice structure itself. Hmm...! An empty tuple gives sys.getsizeof( () ) = 24. But, I would expect a tuple to be merely a list of object pointers; hence I would expect 4 bytes for len(), and then a head pointer 4 bytes, and then a pointer for each object. 3 objects gives 12 bytes, + 8 = 16 bytes. Then we need one more pointer so Python knows where the struct is... So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size as slice() -- but it's 24, even when empty... And 36 when initialized... What are the extra 16 bytes for? All I see is: typedef struct { object** whatever } PyTupleObject; -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Mon, Oct 29, 2012 at 11:00 AM, Andrew Robinson andr...@r3dsolutions.com wrote: Let's look at the source code rather than the web notes -- the source must be the true answer anyhow. I downloaded the source code for python 3.3.0, as the tbz; In the directory Python-3.3.0/Python, look at Python-ast.c, line 2089 ff. Clearly a slice is malloced for a slice_ty type. It has four elements: kind, lower, upper, and step. So, tracing it back to the struct definition... Include/Python-ast.h has typedef struct _slice *slice_ty; And, here's the answer!: enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3}; struct _slice { enum _slice_kind kind; union { struct { expr_ty lower; expr_ty upper; expr_ty step; } Slice; struct { asdl_seq *dims; } ExtSlice; struct { expr_ty value; } Index; } v; }; So, slice() does indeed have arbitrary python types included in it; contrary to what I read elsewhere. expr_ty is a pointer to an arbitrary expression, so the actual structure is 4 pointers, at 32 bits each = 16 bytes. The size of the structure itself, given in an earlier post, is 20 bytes -- which means one more pointer is involved, perhaps the one pointing to the slice structure itself. Hmm...! An empty tuple gives sys.getsizeof( () ) = 24. But, I would expect a tuple to be merely a list of object pointers; hence I would expect 4 bytes for len(), and then a head pointer 4 bytes, and then a pointer for each object. 3 objects gives 12 bytes, + 8 = 16 bytes. Then we need one more pointer so Python knows where the struct is... So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size as slice() -- but it's 24, even when empty... And 36 when initialized... What are the extra 16 bytes for? Every Python object requires two pieces of data, both of which are pointer-sized (one is a pointer, one is an int the size of a pointer). These are: a pointer to the object's type, and the object's reference count. A tuple actually does not need a head pointer: the head pointer is merely an offset from the tuple's pointer. It merely has a ref count, type, an item count, and pointers to its contents. A slice has the same type pointer and reference count, then three pointers to the start, stop, and step objects. This means a slice object should be the same size as a two-item tuple: the tuple needs a count, while that is fixed at 3 for a slice (though some items may be unset). NOTE: The above is taken from reading the source code for Python 2.6. For some odd reason, I am getting that an empty tuple consists of 6 pointer-sized objects (48 bytes on x64), rather than the expected 3 pointer-sized (24 bytes on x64). Slices are showing up as the expected 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1 pointer (8 bytes on x64) per item. I imagine I am missing something, but cannot figure out what that would be. All I see is: typedef struct { object** whatever } PyTupleObject; -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 05:02 PM, Steven D'Aprano wrote: On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote: But, why can't I just overload the existing __getitem__ for lists and not bother writing an entire class? You say that as if writing an entire class was a big complicated effort. It isn't. It is trivially simple, a single line: class MyList(list): ... No, I don't think it big and complicated. I do think it has timing implications which are undesirable because of how *much* slices are used. In an embedded target -- I have to optimize; and I will have to reject certain parts of Python to make it fit and run fast enough to be useful. You can just overload that one method in a subclass of list. Being able to monkey-patch __getitem__ for the list class itself would not be advisable, as it would affect all list slicing anywhere in your program and possibly lead to some unexpected behaviors. That's what I am curious about. What unexpected behaviors would a monkey patch typically cause? What part of unexpected is unclear? Ahh -- The I don't know approach! It's only unexpected if one is a bad programmer...! Let me see if I can illustrate a flavour of the sort of things that can happen if monkey-patching built-ins were allowed. You create a list and print it: # simulated output py x = [5, 2, 4, 1] py print(x) [1, 2, 4, 5] snip Finally you search deep into the libraries used in your code, and *five days later* discover that your code uses library A which uses library B which uses library C which uses library D which installs a harmless monkey-patch to print, but only if library E is installed, and you just happen to have E installed even though your code never uses it, AND that monkey-patch clashes with a harmless monkey-patch to list.__getitem__ installed by library F. And even though each monkey-patch alone is harmless, the combination breaks your code's output. Right, which means that people developing the libraries made contradictory assumptions. Python allows, but does not encourage, monkey-patching of code written in pure Python, because it sometimes can be useful. It flat out prohibits monkey-patching of builtins, because it is just too dangerous. Ruby allows monkey-patching of everything. And the result was predictable: http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/ I read that post carefully; and the author purposely notes that he is exaggerating. BUT Your point is still well taken. What you are talking about is namespace preservation; and I am thinking about it. I can preserve it -- but only if I disallow true Python primitives in my own interpreter; I can't provide two sets in the memory footprint I am using. From my perspective, the version of Python that I compile will not be supported by the normal python help; The predecessor which first forged this path, Pymite, has the same problems -- however, the benefits ought-weigh the disadvantages; and the experiment yielded useful information on what is redundant in Python (eg: range is not supported) and when that redundancy is important for some reason. If someone had a clear explanation of the disadvantages of allowing an iterator, or a tuple -- in place of a slice() -- I would have no qualms dropping the subject. However, I am not finding that yet. I am finding very small optimization issues... The size of an object is at least 8 bytes. Hence, three numbers is going to be at least 24 bytes; and that's 24 bytes in *excess* of the size of slice() or tuple () which are merely containers. So -- There *ARE* savings in memory when using slice(), but it isn't really 2x memory -- its more like 20% -- once the actual objects are considered. The actual *need* for a slice() object still hasn't been demonsrated. I am thinking that the implementation of __getitem__() is very poor probably because of legacy issues. A tuple can also hold None, so ( 1, None, 2 ) is still a valid Tuple. Alternately: An iterator, like xrange(), could be made which takes None as a parameter, or a special value like 'inf'. Since these two values would never be passed to xrange by already developed code, allowing them would not break working code. I am only aware of one possible reason that slice() was once thought to be necessary; and that is because accessing the element of a tuple would recursively call __getitem__ on the tuple. But, even that is easily dismissed once the fixed integer indexes are considered. Your thoughts? Do you have any show stopper insights? -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 06:49 PM, Chris Kaynor wrote: Every Python object requires two pieces of data, both of which are pointer-sized (one is a pointer, one is an int the size of a pointer). These are: a pointer to the object's type, and the object's reference count. A tuple actually does not need a head pointer: the head pointer is merely an offset from the tuple's pointer. It merely has a ref count, type, an item count, and pointers to its contents. A slice has the same type pointer and reference count, then three pointers to the start, stop, and step objects. This means a slice object should be the same size as a two-item tuple: the tuple needs a count, while that is fixed at 3 for a slice (though some items may be unset). NOTE: The above is taken from reading the source code for Python 2.6. For some odd reason, I am getting that an empty tuple consists of 6 pointer-sized objects (48 bytes on x64), rather than the expected 3 pointer-sized (24 bytes on x64). Slices are showing up as the expected 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1 pointer (8 bytes on x64) per item. I imagine I am missing something, but cannot figure out what that would be. All I see is: typedef struct { object** whatever } PyTupleObject; It's fairly straight forward in 3.2.0. I debugged the code with GDB and watched. Perhaps it is the same in 2.6 ? In addition to those items you mention, of which the reference count is not even *inside* the struct -- there is additional debugging information not mentioned. Built in objects contain a line number, a column number, and a context pointer. These each require a full word of storage. Also, built in types appear to have a kind field which indicates the object type but is not a pointer. That suggests two object type indicators, a generic pointer (probably pointing to builtin? somewhere outside the struct) and a specific one (an enum) inside the C struct. Inside the tuple struct, I count 4 undocumented words of information. Over all, there is a length, the list of pointers, a kind, line, col and context; making 6 pieces in total. Although your comment says the head pointer is not required; I found in 3.3.0 that it is a true head pointer; The Tuple() function on line 2069 of Python-ast.c, (3.3 version) -- is passed in a pointer called *elts. That pointer is copied into the Tuple struct. How ironic, slices don't have debugging info, that's the main reason they are smaller. When I do slice(3,0,2), suprisingly Slice() is NOT called. But when I do a[1:2:3] it *IS* called. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 10/29/2012 01:34 PM, Andrew Robinson wrote: No, I don't think it big and complicated. I do think it has timing implications which are undesirable because of how *much* slices are used. In an embedded target -- I have to optimize; and I will have to reject certain parts of Python to make it fit and run fast enough to be useful. Since you can't port the full Python system to your embedded machine anyway, why not just port a subset of python and modify it to suit your needs right there in the C code. It would be a fork, yes, but any port to this target will be a fork anyway. No matter how you cut it, it won't be easy at all, and won't be easy to maintain. You'll basically be writing your own implementation of Python (that's what python-on-a-chip is, and that's why it's closer to Python 2.x than Python 3). That's totally fine, though. I get the impression you think you will be able to port cPython as is to your target. Without a libc, an MMU on the CPU, and a kernel, it's not going to just compile and run. Anyway, the only solution, given your constraints, is to implement your own python interpreter to handle a subset of Python, and modify it to suit your tastes. What you want with slicing behavior changes has no place in the normal cPython implementation, for a lot of reasons. The main one is that it is already possible to implement what you are talking about in your own python class, which is a fine solution for a normal computer with memory and CPU power available. -- http://mail.python.org/mailman/listinfo/python-list
Negative array indicies and slice()
The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying. Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around. I *hate* replicating code every time I need to do this! I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't. Here's an example for Linux shell, otherwise remove /bin/env... {{{#!/bin/env python a=[1,2,3,4,5,6,7,8,9,10] print a[-4:3] # I am interested in getting [7,8,9,10,1,2] but I get []. }}} -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Sun, Oct 28, 2012 at 9:12 PM, andrewr3m...@gmail.com wrote: The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying. Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around. I *hate* replicating code every time I need to do this! I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't. Here's an example for Linux shell, otherwise remove /bin/env... {{{#!/bin/env python a=[1,2,3,4,5,6,7,8,9,10] print a[-4:3] # I am interested in getting [7,8,9,10,1,2] but I get []. }}} For a sequence of length 10, a[-4:3] is equivalent to a[6:3], which is an empty slice since index 6 is after index 3. If you want it to wrap around, then take two slices and concatenate them with a[-4:] + a[:3]. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On 2012-10-29 03:12, andrewr3m...@gmail.com wrote: The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying. Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around. I *hate* replicating code every time I need to do this! I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't. Here's an example for Linux shell, otherwise remove /bin/env... {{{#!/bin/env python a=[1,2,3,4,5,6,7,8,9,10] print a[-4:3] # I am interested in getting [7,8,9,10,1,2] but I get []. }}} If the stride is positive (if omitted it defaults to 1), the slice is from the start index to one before the end index, and a negative index counts from the end. a[-4:3] is equivalent to a[len(a)-4:3], which is an empty list if len(a)-4 = 3. It doesn't wrap around. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Sunday, October 28, 2012 8:43:30 PM UTC-7, Ian wrote: On Sun, Oct 28, 2012 at 9:12 PM, andrew wrote: The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying. Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around. I *hate* replicating code every time I need to do this! I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't. Here's an example for Linux shell, otherwise remove /bin/env... {{{#!/bin/env python a=[1,2,3,4,5,6,7,8,9,10] print a[-4:3] # I am interested in getting [7,8,9,10,1,2] but I get []. }}} For a sequence of length 10, a[-4:3] is equivalent to a[6:3], which is an empty slice since index 6 is after index 3. If you want it to wrap around, then take two slices and concatenate them with a[-4:] + a[:3]. Hi Ian, Well, no it really isn't equivalent. Consider a programmer who writes: xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2] That is the idea of a range; for what reason would anyone *EVER* want -4 to +3 to be 6:3??? I do agree that the data held in -4 is equivalent to the data in 6, but the index is not the same. So: Why does python choose to convert them to positive indexes, and have slice operate differently than xrange -- for the slice() object can't possibly know the size of the array when it is passed in to __getitem__; They are totally separate classes. I realize I can concat. two slice ranges, BUT, the ranges do not always span from negative to positive. eg: a line in my program reads: a[x-5:x] if x is 7, then this is a positive index to a positive index. So, there is no logic to using two slices concatd ! I use this arbitrary range code *often* so I need a general purpose solution. I looked up slice() but the help is of no use, I don't even know how I might overload it to embed some logic to concatenate ranges of data; nor even if it is possible. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Sunday, October 28, 2012 8:43:30 PM UTC-7, Ian wrote: On Sun, Oct 28, 2012 at 9:12 PM, Andrew wrote: The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying. Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around. I *hate* replicating code every time I need to do this! I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't. Here's an example for Linux shell, otherwise remove /bin/env... {{{#!/bin/env python a=[1,2,3,4,5,6,7,8,9,10] print a[-4:3] # I am interested in getting [7,8,9,10,1,2] but I get []. }}} For a sequence of length 10, a[-4:3] is equivalent to a[6:3], which is an empty slice since index 6 is after index 3. If you want it to wrap around, then take two slices and concatenate them with a[-4:] + a[:3]. Hi Ian, Well, no it really isn't equivalent; although Python implements it as equivalent. Consider a programmer who writes: xrange(-4,3) They clearly *want* [-4,-3,-2,-1,0,1,2] That is the idea of a range; So, for what reason would anyone want -4 to +3 to be 6:3??? Can you show me some code where this is desirable?? I do agree that the data held in -4 is equivalent to the data in 6, but the index is not the same. So: Why does python choose to convert them to positive indexes, and have slice operate differently than xrange -- for the slice() object can't possibly know the size of the array when it is passed in to __getitem__; They are totally separate classes. I realize I can concat. two slice ranges, BUT, the ranges do not always span from negative to positive. eg: a line in my program reads: a[x-5:x] if x is 7, then this is a positive index to a positive index. So, there is no logic to using two slices concatd ! I use this arbitrary range code *often* so I need a general purpose solution. I looked up slice() but the help is of no use, I don't even know how I might overload it to embed some logic to concatenate ranges of data; nor even if it is possible. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Sun, Oct 28, 2012 at 10:00 PM, andrewr3m...@gmail.com wrote: Hi Ian, Well, no it really isn't equivalent. Consider a programmer who writes: xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2] That is the idea of a range; for what reason would anyone *EVER* want -4 to +3 to be 6:3??? That is what ranges do, but your question was about slices, not ranges. So: Why does python choose to convert them to positive indexes, and have slice operate differently than xrange -- for the slice() object can't possibly know the size of the array when it is passed in to __getitem__; They are totally separate classes. Ranges can contain negative integers. However, sequences do not have negative indices. Therefore, negative indices in slices are used to count from the end instead of from the start. As stated in the language docs, If either bound is negative, the sequence’s length is added to it. Therefore, a[-4:3] does not wrap around the end of the sequence because a[6:3] does not wrap around the end of the sequence. I realize I can concat. two slice ranges, BUT, the ranges do not always span from negative to positive. def wrapping_slice(seq, start, stop): start, stop, _ = slice(start, stop).indices(len(seq)) if start = stop: return seq[start:stop] else: return seq[start:] + seq[:stop] You'll have to decide for yourself whether you want it to return an empty list or the entire list if start == stop. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
On Oct 29, 2:09 pm, Andrew andrewr3m...@gmail.com wrote: I use this arbitrary range code *often* so I need a general purpose solution. I looked up slice() but the help is of no use, I don't even know how I might overload it to embed some logic to concatenate ranges of data; nor even if it is possible. Slices are passed in if provided to __getitem__/__setitem__/ __delitem__, so you'd need to override it at the list level: class RangedSlicer(list): def __getitem__(self, item): # map item.start, .stop and .step to your own semantics Then wrap your lists with your RangedSlicer class as needed. -- http://mail.python.org/mailman/listinfo/python-list
Re: Negative array indicies and slice()
Andrew andrewr3m...@gmail.com writes: So: Why does python choose to convert them to positive indexes, and have slice operate differently than xrange There was a thread a few years back, I think started by Bryan Olson, that made the case that slice indexing is a Python wart for further reasons than the above, and suggesting a notation like x[$-5] to denote what we now call x[-5] (i.e. $ is the length of the string). So your example x[$-4:3] would clearly be the same thing as x[6:3] and not give any suggestion that it might wrap around. -- http://mail.python.org/mailman/listinfo/python-list