Re: [Python-Dev] new buffer in python2.7

2010-11-08 Thread Lennart Regebro
On Wed, Oct 27, 2010 at 12:36, Antoine Pitrou  wrote:
> On Wed, 27 Oct 2010 10:13:12 +0800
> Kristján Valur Jónsson  wrote:
>> Although 2.7 has the new buffer interface and memoryview
>> objects, these are widely not accepted in the built in modules.
>
> That's true, and slightly unfortunate. It could be a reason for
> switching to 3.1/3.2 :-)

It's rather a reason against it, as it makes supporting both Python 2
and Python 3 harder.
However, fixing this in 2.7 just means that you need to support 2.7x
or later only, so it's not a good solution.
I think using compatibility types is a better solution. I suggested
something like that for inclusion in "six", but it was softly
rejected. :-)

Something like this, for example. It's a str in Python 2 and a bytes in
Python 3, but it extends both classes with a consistent interface.
Improvements, comments and ideas are welcome.

bites.py:

import sys

if sys.version_info[0] < 3:

    class Bites(str):
        def __new__(cls, value):
            if value and isinstance(value[0], int):
                # It's a list of integers
                value = ''.join([chr(x) for x in value])
            return super(Bites, cls).__new__(cls, value)

        def itemint(self, index):
            return ord(self[index])

        def iterint(self):
            for x in self:
                yield ord(x)

else:

    class Bites(bytes):
        def __new__(cls, value):
            if isinstance(value, str):
                # It's a unicode string:
                value = value.encode('ISO-8859-1')
            return super(Bites, cls).__new__(cls, value)

        def itemint(self, x):
            return self[x]

        def iterint(self):
            for x in self:
                yield x
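
A short usage sketch, assuming Python 3 only. The Python 3 branch of the class is repeated here so the snippet runs on its own; the sample value and the names header, first and codes are made up for illustration.

```python
class Bites(bytes):
    """Python 3 branch of Bites, repeated so this snippet is standalone."""

    def __new__(cls, value):
        if isinstance(value, str):
            # Text is encoded byte-for-byte via ISO-8859-1
            value = value.encode('ISO-8859-1')
        return super().__new__(cls, value)

    def itemint(self, index):
        # Indexing bytes already yields an int in Python 3
        return self[index]

    def iterint(self):
        for x in self:
            yield x


header = Bites('\x01\x02abc')      # built from a text string
first = header.itemint(0)          # integer value of the first byte
codes = list(header.iterint())     # integer values of all bytes
print(first, codes)
```

The point of the class is that itemint() and iterint() return integers on both major versions, so calling code does not need its own version checks.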


-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python3porting.com/
+33 661 58 14 64
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] new buffer in python2.7

2010-11-01 Thread Nick Coghlan
2010/11/1 Kristján Valur Jónsson :
> Ah, yes.  There are, in my case.  (why do I always seem to be doing stuff 
> that is different from what you all are doing:)

I would guess that most of us aren't writing MMOs for a living. Gamers
seem to be a particularly demanding breed of user :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] new buffer in python2.7

2010-11-01 Thread Kristján Valur Jónsson
Ah, yes.  There are, in my case.  (why do I always seem to be doing stuff that 
is different from what you all are doing:)
The particular piece of code is from the chunked reader. It may be reading 
rather large chunks at a time (many KB):

import struct

def recvchunk(sock):
    # 4-byte length prefix, followed by the payload itself
    (length,) = struct.unpack('i', recv_exactly(sock, 4))
    return recv_exactly(sock, length)

# old style
def recv_exactly(sock, length):
    data = []
    while length:
        got = sock.recv(length)
        if not got:
            raise EOFError
        data.append(got)
        length -= len(got)
    return "".join(data)

# new style
def recv_exactly(sock, length):
    data = bytearray(length)
    view = memoryview(data)
    while length:
        # recv_into() returns the number of bytes written into the view
        got = sock.recv_into(view[-length:])
        if not got:
            raise EOFError
        length -= got
    return data


Here I spot another optimization opportunity:  let memoryview[:] return self, 
since the object is immutable, I believe.
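
A runnable sketch of the "new style" reader, assuming Python 3 and using socket.socketpair() as a stand-in for a real connection. The '!i' network-order length prefix, the recv_chunk name and the 5000-byte payload are assumptions made for this illustration, not part of the original code.

```python
import socket
import struct


def recv_exactly(sock, length):
    """Read exactly `length` bytes into one preallocated buffer."""
    data = bytearray(length)
    view = memoryview(data)
    received = 0
    while received < length:
        # recv_into() writes straight into the buffer and returns a count
        got = sock.recv_into(view[received:])
        if got == 0:
            raise EOFError("connection closed early")
        received += got
    return data


def recv_chunk(sock):
    """Read a 4-byte length prefix, then that many payload bytes."""
    (size,) = struct.unpack('!i', recv_exactly(sock, 4))
    return recv_exactly(sock, size)


left, right = socket.socketpair()
payload = b'x' * 5000
left.sendall(struct.pack('!i', len(payload)) + payload)
chunk = recv_chunk(right)
left.close()
right.close()
print(len(chunk))
```

Each pass through the loop creates one small memoryview object, but the payload bytes are written once, directly into their final location.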
K

> -Original Message-
> From: "Martin v. Löwis" [mailto:mar...@v.loewis.de]
> Sent: 1. nóvember 2010 14:22
> To: Kristján Valur Jónsson
> Cc: python-dev@python.org
> Subject: Re: [Python-Dev] new buffer in python2.7
> 
> 
> Assuming there are multiple recv calls. For a typical struct, all data
> will come out of the stream with a single recv. so no join will be
> necessary.
> 
> Regards,
> Martin



Re: [Python-Dev] new buffer in python2.7

2010-11-01 Thread Stefan Behnel

Stefan Behnel, 01.11.2010 09:45:

> If slice object creation bothers you here, it might be worth using a
> free list in PySlice_New() instead of creating new slice objects on
> request.
> [...]
> You can take a look at how it's done in tupleobject.c if you want to
> provide a patch.


Hmm, that's actually a particularly bad place to look. The implementation 
in listobject.c is much simpler.


Stefan



Re: [Python-Dev] new buffer in python2.7

2010-11-01 Thread Kristján Valur Jónsson
I've already created a patch.  See http://bugs.python.org/issue10227.
I was working with 2.7, where slicing sequences is done differently than in 3.2, 
so the difference is not very great.
I'm going to have another go at profiling the 3.2 version later and see why 
slicing a bytearray is so much more expensive than slicing a bytes object.
K

> -Original Message-
> From: python-dev-bounces+kristjan=ccpgames@python.org
> [mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf
> Of Stefan Behnel
> Sent: 1. nóvember 2010 16:45
> To: python-dev@python.org
> Subject: Re: [Python-Dev] new buffer in python2.7
> 
> Kristján Valur Jónsson, 27.10.2010 16:28:
> > Notice how a Slice object is generated.  Then a PyObject_GetItem() is
> > done.  The salient code path is from apply_slice().  A slice object must
> > be constructed and destroyed.
> 
> If slice object creation bothers you here, it might be worth using a free
> list in PySlice_New() instead of creating new slice objects on request.
> 
> Creating a slice of something is not necessarily such a costly operation
> that it dominates creating the slice object, so optimising the slice
> request itself sounds like a good idea.
> 
> You can take a look at how it's done in tupleobject.c if you want to provide
> a patch. Then, please open a bug tracker ticket and attach the patch there
> (and post a link to the ticket in this thread).
> 
> Stefan
> 


Re: [Python-Dev] new buffer in python2.7

2010-11-01 Thread Stefan Behnel

Kristján Valur Jónsson, 27.10.2010 16:28:

> Notice how a Slice object is generated.  Then a PyObject_GetItem() is
> done.  The salient code path is from apply_slice().  A slice object must
> be constructed and destroyed.


If slice object creation bothers you here, it might be worth using a free 
list in PySlice_New() instead of creating new slice objects on request.


Creating a slice of something is not necessarily such a costly operation 
that it dominates creating the slice object, so optimising the slice 
request itself sounds like a good idea.


You can take a look at how it's done in tupleobject.c if you want to provide 
a patch. Then, please open a bug tracker ticket and attach the patch there 
(and post a link to the ticket in this thread).


Stefan



Re: [Python-Dev] new buffer in python2.7

2010-11-01 Thread Stefan Behnel

Kristján Valur Jónsson, 27.10.2010 16:32:

> Sorry, here the tables properly formatted:


Certainly looked better on your first try.

Stefan



Re: [Python-Dev] new buffer in python2.7

2010-10-31 Thread Martin v. Löwis
>> def read_and_unpack(stream, format): 
>>   data = stream.read(struct.calcsize(format))
>>   return struct.unpack(format, data)
>> 
>>> Otherwise, I'm +1 on your suggestion, avoiding copying is a good
>>> thing.
>> 
>> I believe my function also doesn't involve any unnecessary copies.
> You just moved your copying down one level into stream.read(). This
> magic function must be implemented by possibly concatenating
> several "socket.recv()" calls.
> This invariably involves data copying, either by "".join() or
> stringio.write()

That assumes there are multiple recv calls. For a typical struct, all data
will come out of the stream with a single recv, so no join will be
necessary.

Regards,
Martin


Re: [Python-Dev] new buffer in python2.7

2010-10-31 Thread Kristján Valur Jónsson
You just moved your copying down one level into stream.read().
This magic function must be implemented by possibly concatenating several 
"socket.recv()" calls.
This invariably involves data copying, either by "".join() or stringio.write()
K

-Original Message-
From: python-dev-bounces+kristjan=ccpgames@python.org 
[mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf Of 
"Martin v. Löwis"
Sent: Friday, October 29, 2010 18:15
To: python-dev@python.org
Subject: Re: [Python-Dev] new buffer in python2.7
That is easy to achieve using the existing API:

def read_and_unpack(stream, format):
    data = stream.read(struct.calcsize(format))
    return struct.unpack(format, data)

> Otherwise, I'm +1 on your suggestion, avoiding copying is a good thing.

I believe my function also doesn't involve any unnecessary copies.




Re: [Python-Dev] new buffer in python2.7

2010-10-29 Thread Martin v. Löwis
> Actually I would like code like
>   s = socket()
>   ...
>   header = struct.unpack("i", s)
> 
> In other words, struct should interact with files/streams directly, instead
> of requiring me to first read a chunk whose size I manually have to
> determine etc.

That is easy to achieve using the existing API:

import struct

def read_and_unpack(stream, format):
    data = stream.read(struct.calcsize(format))
    return struct.unpack(format, data)

> Otherwise, I'm +1 on your suggestion, avoiding copying is a good thing.

I believe my function also doesn't involve any unnecessary copies.
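
A runnable sketch of the helper above, assuming Python 3 and an in-memory io.BytesIO standing in for the stream. The short-read check, the '<iH' format and the sample values are additions made for this illustration.

```python
import io
import struct


def read_and_unpack(stream, fmt):
    """Read exactly the bytes that `fmt` describes and unpack them."""
    size = struct.calcsize(fmt)
    data = stream.read(size)
    if len(data) != size:
        # Assumption for this sketch: a short read is treated as end of stream
        raise EOFError("expected %d bytes, got %d" % (size, len(data)))
    return struct.unpack(fmt, data)


# A 6-byte header (little-endian int plus unsigned short), then a payload
stream = io.BytesIO(struct.pack('<iH', 1234, 7) + b'rest')
header = read_and_unpack(stream, '<iH')
remaining = stream.read()
print(header, remaining)
```

struct.calcsize() removes the need to work out the chunk size by hand. Whether stream.read() copies internally depends on the stream implementation.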

Regards,
Martin


Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Kristján Valur Jónsson
Not cheap enough.  There is no reason why creating another memoryview shouldn't 
be as cheap as creating a string slice.

I'm investigating why.

K



-Original Message-
From: python-dev-bounces+kristjan=ccpgames@python.org 
[mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf Of 
Antoine Pitrou
Sent: Wednesday, October 27, 2010 20:15
To: python-dev@python.org
Subject: Re: [Python-Dev] new buffer in python2.7

On Wed, 27 Oct 2010 20:00:10 +0800
Kristján Valur Jónsson wrote:
> Calling getbuffer on a bytearray or a bytes object should be really
> cheap, so I still don't accept this as a matter of fact situation.

It *is* cheap. It's just that copying a short slice is dirt cheap as well.

Of course, if you manipulate 1 KB slices or longer, it's different.

Regards

Antoine.



Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Kristján Valur Jónsson
Sorry, here the tables properly formatted:



Function Name                 Inclusive Samples   Exclusive Samples   Inclusive Samples %   Exclusive Samples %
apply_slice                   8.997               468                 62,23                 3,24
_PyObject_GetItem             6.257               400                 43,28                 2,77
memory_subscript              5.857               1.051               40,51                 7,27
_PyMemoryView_FromBuffer      2.642               455                 18,27                 3,15
memory_dealloc                1.572               297                 10,87                 2,05
_PyObject_Malloc              1.374               1.374               9,50                  9,50
__PyObject_GC_New             1.256               236                 8,69                  1,63
_PySlice_New                  1.211               333                 8,38                  2,30
slice_dealloc                 1.061               769                 7,34                  5,32
__PyObject_GC_Malloc          1.022               293                 7,07                  2,03
bytearray_getbuffer           987                 354                 6,83                  2,45
dup_buffer                    932                 932                 6,45                  6,45

Function Name                 Inclusive Samples   Exclusive Samples   Inclusive Samples %   Exclusive Samples %
apply_slice                   3.888               502                 48,44                 6,25
_PySequence_GetSlice          3.039               350                 37,86                 4,36
string_slice                  2.689               281                 33,50                 3,50
_PyString_FromStringAndSize   2.409               575                 30,01                 7,16
[MSVCR90.dll]                 1.413               1.407               17,61                 17,53
string_dealloc                467                 150                 5,82                  1,87
_PyObject_Malloc              379                 379                 4,72                  4,72
_PyObject_Free                378                 378                 4,71                  4,71
__PyEval_SliceIndex           347                 347                 4,32                  4,32










Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Kristján Valur Jónsson
So, I did some profiling.  It turns out that the difference is not due to the 
creation of the MemoryView object as such, but the much more complicated slice 
handling for generic objects.
Here is the interesting part of the inclusive sample measurements for the 
MemoryView:
Function Name                 Inclusive Samples   Exclusive Samples   Inclusive Samples %   Exclusive Samples %
apply_slice                   8.997               468                 62,23                 3,24
_PyObject_GetItem             6.257               400                 43,28                 2,77
memory_subscript              5.857               1.051               40,51                 7,27
_PyMemoryView_FromBuffer      2.642               455                 18,27                 3,15
memory_dealloc                1.572               297                 10,87                 2,05
_PyObject_Malloc              1.374               1.374               9,50                  9,50
__PyObject_GC_New             1.256               236                 8,69                  1,63
_PySlice_New                  1.211               333                 8,38                  2,30
slice_dealloc                 1.061               769                 7,34                  5,32
__PyObject_GC_Malloc          1.022               293                 7,07                  2,03
bytearray_getbuffer           987                 354                 6,83                  2,45
dup_buffer                    932                 932                 6,45                  6,45

Notice how  a Slice object is generated.  Then a PyObject_GetItem() is done.  
The salient code path is from apply_slice().  A slice object must be 
constructed and destroyed.
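
The generic path can be observed from Python itself. This is a minimal sketch, assuming Python 3 (where every `obj[a:b]` goes through `__getitem__` with a slice object); the Probe class is a made-up helper for illustration and is not part of CPython.

```python
class Probe:
    """Hypothetical helper: hands back whatever key the interpreter passes."""

    def __getitem__(self, key):
        return key


key = Probe()[:100]
print(type(key).__name__, key.start, key.stop, key.step)

# Slicing a memoryview also produces a brand-new memoryview object
view = memoryview(b'x' * 10000)
part = view[:100]
print(type(part).__name__, len(part), part is view)
```

Each slice expression therefore costs at least one temporary slice object in addition to the result object, which matches the _PySlice_New and slice_dealloc entries in the profile.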

In contrast, when done on a string directly (or a bytes object in py3k) you go 
directly to PySequence_GetSlice.
Function Name                 Inclusive Samples   Exclusive Samples   Inclusive Samples %   Exclusive Samples %
apply_slice                   3.888               502                 48,44                 6,25
_PySequence_GetSlice          3.039               350                 37,86                 4,36
string_slice                  2.689               281                 33,50                 3,50
_PyString_FromStringAndSize   2.409               575                 30,01                 7,16
[MSVCR90.dll]                 1.413               1.407               17,61                 17,53
string_dealloc                467                 150                 5,82                  1,87
_PyObject_Malloc              379                 379                 4,72                  4,72
_PyObject_Free                378                 378                 4,71                  4,71
__PyEval_SliceIndex           347                 347                 4,32                  4,32

(The measurements for the MemoryView above already contain some optimizations 
I've done on naïve functions).
This area is ripe for improvements.

K


Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Antoine Pitrou
On Wed, 27 Oct 2010 20:00:10 +0800
Kristján Valur Jónsson  wrote:
> Calling getbuffer on a bytearray or a bytes object should
> be really cheap, so I still don't accept this as a matter of fact
> situation.

It *is* cheap. It's just that copying a short slice is dirt cheap as
well.

Of course, if you manipulate 1 KB slices or longer, it's different.

Regards

Antoine.


Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Kristján Valur Jónsson
Ah, well in 2.7 you don't have the luxury of a bytes object.  A str() would be 
similar in 2.7.
Anyway, isn't the "bytes" object immutable? In that case, it is not a useful 
target for a sock.recv_into() call.
Calling getbuffer on a bytearray or a bytes object should be really cheap, so I 
still don't accept this as a matter of fact situation.
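
The immutability point can be checked directly. This is a small sketch, assuming Python 3; the variable names are illustrative only.

```python
frozen = memoryview(b'abcd')       # view over an immutable bytes object
target = bytearray(4)
writable = memoryview(target)      # view over a mutable bytearray

writable[:2] = b'hi'               # writes through to `target`, no copy

try:
    frozen[:2] = b'hi'             # a read-only view rejects writes
    write_failed = False
except TypeError:
    write_failed = True

print(frozen.readonly, writable.readonly, target, write_failed)
```

This is why a bytearray (or a memoryview over one) is the natural target for recv_into(), while a bytes object cannot serve as one.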

Btw, going to 3.2 isn't a real option for us any time soon.  A lot of companies 
probably find themselves in a similar situation.  This is why I spend so much 
effort applying some love to 2.7.  Most of the stuff I locally do to 2.7 I then 
contribute to python as 3.2 patches.  Someday they'll get backported, no doubt 
:)

K

-Original Message-
From: python-dev-bounces+kristjan=ccpgames@python.org 
[mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf Of 
Antoine Pitrou
Sent: Wednesday, October 27, 2010 19:16
To: python-dev@python.org
Subject: Re: [Python-Dev] new buffer in python2.7


> >Here are micro-benchmarks under 3.2:
> 
> > $ ./python -m timeit -s "x = b'x'*10000" "x[:100]"
> > 10000000 loops, best of 3: 0.134 usec per loop
> > $ ./python -m timeit -s "x = memoryview(b'x'*10000)" "x[:100]"
> > 10000000 loops, best of 3: 0.151 usec per loop
> 
> That's weird.  The greedy slice needs two memory allocations.  One for 
> the ByteArray object itself, one for its cargo.  In total, more than
> 100 bytes.  In contrast, creating the MemoryView object requires only 
> one allocation of a few dozen bytes.

It's not a bytearray object, it's a bytes object. It requires only a single 
allocation since the data is allocated inline.
The memoryview object must also call getbuffer() again on the original object.

Regards

Antoine.




Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Antoine Pitrou

> >Here are micro-benchmarks under 3.2:
> 
> > $ ./python -m timeit -s "x = b'x'*10000" "x[:100]"
> > 10000000 loops, best of 3: 0.134 usec per loop
> > $ ./python -m timeit -s "x = memoryview(b'x'*10000)" "x[:100]"
> > 10000000 loops, best of 3: 0.151 usec per loop
> 
> That's weird.  The greedy slice needs two memory allocations.  One for
> the ByteArray object itself, one for its cargo.  In total, more than
> 100 bytes.  In contrast, creating the MemoryView object requires only
> one allocation of a few dozen bytes.

It's not a bytearray object, it's a bytes object. It requires only a
single allocation since the data is allocated inline.
The memoryview object must also call getbuffer() again on the original
object.

Regards

Antoine.




Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Kristján Valur Jónsson


-Original Message-
From: python-dev-bounces+kristjan=ccpgames@python.org 
[mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf Of 
Antoine Pitrou
Sent: Wednesday, October 27, 2010 18:36


>Here are micro-benchmarks under 3.2:

> $ ./python -m timeit -s "x = b'x'*10000" "x[:100]"
> 10000000 loops, best of 3: 0.134 usec per loop
> $ ./python -m timeit -s "x = memoryview(b'x'*10000)" "x[:100]"
> 10000000 loops, best of 3: 0.151 usec per loop

That's weird.  The greedy slice needs two memory allocations.  One for the 
ByteArray object itself, one for its cargo.  In total, more than 100 bytes.  In 
contrast, creating the MemoryView object requires only one allocation of a few 
dozen bytes.

The performance difference must come from some other weird overhead, such as 
initializing the new MemoryView object.

This would be pretty cool to profile using a proper profiler.  I'll see what my 
MS tools can come up with.

Meanwhile, a patch is in the tracker:

http://bugs.python.org/issue10212
Also this:
http://bugs.python.org/issue10211

There is a precedent of treating the failure to accept the Py_buffer interface 
as bugs in 2.7.  After all, this is a supported internal buffer.  See for 
example:
http://bugs.python.org/issue8104

K




Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Antoine Pitrou
On Wed, 27 Oct 2010 10:13:12 +0800
Kristján Valur Jónsson  wrote:
> Although 2.7 has the new buffer interface and memoryview
> objects, these are widely not accepted in the built in modules.

That's true, and slightly unfortunate. It could be a reason for
switching to 3.1/3.2 :-)

> IMHO this is unfortunate.  For example when doing network io, you would want 
> code like this:
> Buffer = bytearray(10)
> Socket.recv_into(Buffer)
> Header = struct.unpack("i", memoryview(Buffer)[:4])[0]

This can be a useless micro-optimization.

People are often misled by the implicit analogy with C. In Python,
a "lazy slice" still allocates memory for a whole new PyObject (for
example a memoryview). So lazy slices are only a win if they are
actually big (because a raw memcpy() is fast).

Actually, lazy slices can be *slower* if they instantiate an object
whose allocation is less optimized than the built-in bytes object's.

Here are micro-benchmarks under 3.2:

$ ./python -m timeit -s "x = b'x'*10000" "x[:100]"
10000000 loops, best of 3: 0.134 usec per loop
$ ./python -m timeit -s "x = memoryview(b'x'*10000)" "x[:100]"
10000000 loops, best of 3: 0.151 usec per loop

$ ./python -m timeit -s "x = b'x'*10000" "x[:1000]"
1000000 loops, best of 3: 0.228 usec per loop
$ ./python -m timeit -s "x = memoryview(b'x'*10000)" "x[:1000]"
10000000 loops, best of 3: 0.151 usec per loop

So, as you see, creating a 100-byte slice from a 10 KB bytestring is
faster when using normal (eager) slices.
It becomes slower when creating a 1KB slice, but is still very fast
(under one microsecond).
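
The same comparison can be scripted with the timeit module. This is a sketch, assuming Python 3; the time_slice helper, the loop count and the labels are choices made here, and the absolute numbers will differ from the 3.2 figures above on other machines and versions.

```python
import timeit


def time_slice(setup, size, number=200000):
    """Best-of-3 time, in microseconds per loop, of slicing x[:size]."""
    runs = timeit.repeat("x[:%d]" % size, setup=setup, repeat=3, number=number)
    return min(runs) / number * 1e6


setups = [
    ("bytes", "x = b'x' * 10000"),
    ("memoryview", "x = memoryview(b'x' * 10000)"),
]

results = {}
for label, setup in setups:
    for size in (100, 1000):
        results[(label, size)] = time_slice(setup, size)

for (label, size), usec in sorted(results.items()):
    print("%-10s x[:%-4d] %.3f usec per loop" % (label, size, usec))
```

The interesting figure is how each column changes with slice size: the eager bytes slice pays for a copy proportional to the size, while the memoryview slice pays a roughly fixed object-creation cost.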

> Not forgetting the StringI object in cStringIO.
> IMHO, not accepting buffers by these objects can be considered a bug,
> that needs fixing.

It is often tempting to say that a "necessary" feature is a bug, but
it's a slippery slope. I would say it's only a bug when it's been
documented to work. I don't think StringIO objects have ever supported
the buffer protocol. In 3.2, though, you can use the
BytesIO.getbuffer() method:
http://docs.python.org/dev/library/io.html#io.BytesIO.getbuffer
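
A minimal sketch of BytesIO.getbuffer(), assuming Python 3.2 or later; the sample contents are made up for illustration.

```python
import io

bio = io.BytesIO(b'hello world')

# getbuffer() exposes the BytesIO contents as a writable view, without copying
with bio.getbuffer() as view:
    view[:5] = b'HELLO'      # modifies the BytesIO contents in place
    length = len(view)

result = bio.getvalue()
print(length, result)
```

The with block releases the view on exit. While a view is outstanding the BytesIO object cannot be resized, so the view should be kept only as long as it is needed.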

(another reason to switch perhaps :-))

In any case, I think it should be the release manager's decision here.

Regards

Antoine.




Re: [Python-Dev] new buffer in python2.7

2010-10-27 Thread Ulrich Eckhardt
On Wednesday 27 October 2010, Kristján Valur Jónsson wrote:
> Although 2.7 has the new buffer interface and memoryview objects, these are
> widely not accepted in the built in modules. Examples are the structmodule,
> some of the socketmodule apis, etc.
>
> IMHO this is unfortunate.  For example when doing network io, you would
> want code like this:
> Buffer = bytearray(10)
> Socket.recv_into(Buffer)
> Header = struct.unpack("i", memoryview(Buffer)[:4])[0]

Actually I would like code like
  s = socket()
  ...
  header = struct.unpack("i", s)

In other words, struct should interact with files/streams directly, instead of 
requiring me to first read a chunk whose size I manually have to determine 
etc.

Otherwise, I'm +1 on your suggestion, avoiding copying is a good thing.

Uli
(who's going to shut up, because he doesn't have a patch either)

-- 
Sator Laser GmbH, Fangdieckstraße 75a, 22547 Hamburg, Deutschland
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932



Re: [Python-Dev] new buffer in python2.7

2010-10-26 Thread Kristján Valur Jónsson
Not forgetting the StringI object in cStringIO.
IMHO, not accepting buffers by these objects can be considered a bug that needs 
fixing.
K



[Python-Dev] new buffer in python2.7

2010-10-26 Thread Kristján Valur Jónsson
Although 2.7 has the new buffer interface and memoryview objects, these are 
not widely accepted in the built-in modules.
Examples are the struct module, some of the socket module APIs, etc.

IMHO this is unfortunate.  For example, when doing network I/O, you would want 
code like this:
Buffer = bytearray(10)
Socket.recv_into(Buffer)
Header = struct.unpack("i", memoryview(Buffer)[:4])[0]

In other words, you want to minimize copying the data around.  Currently in 2.7 
you have to cast the memoryview to str(), which copies the data.  In 3.0 you 
don't.
Is there any chance of getting changes like these in?
(swapping s# for s* in PyArg_ParseTuple in a few strategic places)
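
For comparison, a sketch of the copy-free pattern as it works on Python 3, where struct accepts any object supporting the buffer protocol. The buffer contents here are made up; in real code the bytes would come from recv_into().

```python
import struct

# Stand-in for data received from a socket: a native int header plus payload
buffer = bytearray(struct.pack('i', 42) + b'payload')

# Slicing the memoryview does not copy the underlying bytes
header = struct.unpack('i', memoryview(buffer)[:4])[0]

# unpack_from() reads at an offset and avoids even the temporary view
(header_again,) = struct.unpack_from('i', buffer, 0)

print(header, header_again)
```

struct.unpack_from() is often the simpler choice when only a fixed-size prefix is needed, since no slice or view object has to be created.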

My current list of places would be:
_struct.c
arraymodule.c
zlibmodule.c
marshal.c

audioop.c and imageop.c are probably less important.
K