[issue19087] bytearray front-slicing not optimized

2015-04-18 Thread Martin Panter

Martin Panter added the comment:

I think the changes for this issue are causing the crash and unexpected buffer 
expansion described in Issue 23985. Appending to a bytearray() can overstep the 
memory buffer because it doesn’t account for ob_start when checking for 
resizing. And “del” can expand the allocated memory due to an off-by-one error. 
Please have a look at my patches. Perhaps there are other operations that also 
need patching to account for ob_start.

--
nosy: +vadmium

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-10-05 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 499a96611baa by Antoine Pitrou in branch 'default':
Issue #19087: Improve bytearray allocation in order to allow cheap popping of 
data at the front (slice deletion).
http://hg.python.org/cpython/rev/499a96611baa

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-10-05 Thread Antoine Pitrou

Antoine Pitrou added the comment:

The commit produced compiled errors on Windows, but I've since fixed them.

--
resolution:  - fixed
stage: patch review - committed/rejected
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-10-05 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Side effect of this change is that bytearray's data now can be non-aligned. We 
should examine all places which relies on this.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-10-05 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 Side effect of this change is that bytearray's data now can be
 non-aligned. We should examine all places which relies on this.

The C API makes no guarantees as to alignment of private data areas, so
any external code relying on it would be incorrect.

The remaining question is whether the bytearray implementation relies on
it, but I don't think that's the case.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-10-04 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I don't see much sense in differences between bytea_slice2.patch and 
bytea_slice3.patch, because bytea_slice3.patch is not smaller and simpler than 
bytea_slice2.patch.

I meant that you can continue use self-ob_bytes instead of 
PyByteArray_AS_STRING(self) if self-ob_bytes points not to the start of 
physical buffer, but to the start of logical byte array. *This* will simplify 
the patch a lot.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-10-04 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 I meant that you can continue use self-ob_bytes instead of
 PyByteArray_AS_STRING(self) if self-ob_bytes points not to the
 start of physical buffer, but to the start of logical byte array.
 *This* will simplify the patch a lot.

It will make the diff smaller but it will not simplify the patch.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-10-03 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Here is a slightly modified patch implementing Serhiy's suggestion.

--
Added file: http://bugs.python.org/file31954/bytea_slice3.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-10-03 Thread STINNER Victor

STINNER Victor added the comment:

bytea_slice3.patch looks simpler than  bytea_slice2.patch, I prefer it.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 I'm not sure that it is what you expected: bytearray() is only initialized 
 once (setup of timeit). You probably want to reinitialize at each loop.

There is no setup of timeit here. And you forgot bytes(b) after accumulating 
loop. bench_bytearray.py shows me 10% slowdown for 10**3 and 10**5 bytes tests.

Of course it can be a measurement glitch. On other hand, there are no 
measurements which show positive effect of the patch for real code. Currently 
we consider only hypothetic code and can't compare it with alternatives.

 The problem is the suboptimal code is also the natural way to write such 
 code. If you know a simple and idiomatic way to write an optimal bytes FIFO, 
 then please share it with us.

Please share this written in the natural way real code with us. I can't 
compare with alternatives a code which I don't see.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Antoine Pitrou

Antoine Pitrou added the comment:

  The problem is the suboptimal code is also the natural way to
 write such code. If you know a simple and idiomatic way to write an
 optimal bytes FIFO, then please share it with us.
 
 Please share this written in the natural way real code with us. I
 can't compare with alternatives a code which I don't see.

I'm sorry, I don't want to spend more time on such a minor issue.  The
patch is simple and yields good benefits, and Victor seems to have
approved it, so I'm inclined to ignore your skepticism and commit.

I would have liked more constructive criticism (perhaps the patch is
inefficient or suboptimal, etc.), but it seems I'll have to do without
it.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I don't understand why you avoid to show any examples which benefit. Shouldn't 
optimizing patches prove their efficient?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 I don't understand why you avoid to show any examples which benefit.
 Shouldn't optimizing patches prove their efficient?

Many micro-optimizations get committed without proving themselves on a
high-level benchmark suite, as long as they produce a big enough
difference on micro-benchmarks. I think you have noticed that!

A 4x improvement on a micro-benchmark is very likely to make a
difference in at least some real-world code (while a 10% improvement
wouldn't).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Antoine Pitrou

Antoine Pitrou added the comment:

However, the patch had a bug in the resizing logic. Here is a new patch fixing 
that (+ an additional test).

--
Added file: http://bugs.python.org/file31926/bytea_slice2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Other benchmarks for the new patch (exercising FIFO-like behaviour: some data 
is appended at one end, and popped at the other):

timeit -s b=bytearray(10);s=b'x'*100 b[:100] = b''; b.extend(s)
- before: 4.07 usec per loop
- after: 0.812 usec per loop

For comparison, popping from the end (LIFO-like):

timeit -s b=bytearray(10);s=b'x'*100 b[-100:] = b''; b.extend(s)
- before: 0.894 usec per loop
- after: 0.819 usec per loop

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 A 4x improvement on a micro-benchmark is very likely to make a
difference in at least some real-world code (while a 10% improvement
wouldn't).

If there is a code that uses the deleting from the beginning of a bytearray. I 
just pray to demonstrate this code. Perhaps you only intend to write a code 
that will use it. Wonderful. I want to look at it and make sure that the same 
problem can not be solved just as effective in another way.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread STINNER Victor

STINNER Victor added the comment:

I took me some time, but Antoine explained me the use case on IRC :-) The patch 
is useful is the bytearray is used as a FIFO: remove front, append tail. It can 
be seen as an implementation for BufferedReader. Consumer/producer is a common 
pattern, especially consuming one end (front) and produce at the other end 
(tail).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread STINNER Victor

STINNER Victor added the comment:

I adapted my micro-benchmark to measure the speedup: bench_bytearray2.py. 
Result on  bytea_slice2.patch:

Common platform:
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g 
-fwrapv -O3 -Wall -Wstrict-prototypes
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Timer info: namespace(adjustable=False, 
implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, 
resolution=1e-09)
Platform: Linux-3.9.4-200.fc18.x86_64-x86_64-with-fedora-18-Spherical_Cow
Python unicode implementation: PEP 393
Timer: time.perf_counter
Bits: int=32, long=64, long long=64, size_t=64, void*=64
Timer precision: 40 ns

Platform of campaign original:
Date: 2013-09-30 23:39:31
Python version: 3.4.0a2+ (default:687dd81cee3b, Sep 30 2013, 23:39:27) [GCC 
4.7.2 20121109 (Red Hat 4.7.2-8)]
SCM: hg revision=687dd81cee3b tag=tip branch=default date=2013-09-29 22:18 
+0200

Platform of campaign patched:
Date: 2013-09-30 23:38:55
Python version: 3.4.0a2+ (default:687dd81cee3b+, Sep 30 2013, 23:30:35) [GCC 
4.7.2 20121109 (Red Hat 4.7.2-8)]
SCM: hg revision=687dd81cee3b+ tag=tip branch=default date=2013-09-29 22:18 
+0200

+-+
non regression  |original | patched
+-+
concatenate 10**1 bytes |  1.1 us (*) | 1.14 us
concatenate 10**3 bytes | 46.9 us | 46.8 us (*)
concatenate 10**5 bytes | 4.66 ms (*) | 4.71 ms
concatenate 10**7 bytes |  478 ms (*) |  483 ms
+-+
Total   |  482 ms (*) |  488 ms
+-+

+---+-
deleting front, append tail |  original |  patched
+---+-
buffer 10**1 bytes  |639 ns (*) | 689 ns (+8%)
buffer 10**3 bytes  |682 ns (*) | 723 ns (+6%)
buffer 10**5 bytes  |   3.54 us (+428%) |   671 ns (*)
buffer 10**7 bytes  | 900 us (+107128%) |   840 ns (*)
+---+-
Total   |  905 us (+30877%) |  2.92 us (*)
+---+-

+--+
Summary | original | patched
+--+
non regression  |   482 ms (*) |  488 ms
deleting front, append tail | 905 us (+30877%) | 2.92 us (*)
+--+
Total   |   483 ms (*) |  488 ms
+--+

@Serhiy: I see zero difference in the append loop micro-benchmark. I added 
the final cast to bytes()

@Antoine: Your patch rocks, 30x faster! (I don't care of the 8% slowdown in the 
nanosecond timing).

--
Added file: http://bugs.python.org/file31929/bench_bytearray2.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Results under Windows:

- before:

PCbuild\amd64\python.exe -m timeit b=bytearray(10) while b: b[-1:] = b''
10 loops, best of 3: 74.8 msec per loop

PCbuild\amd64\python.exe -m timeit b=bytearray(10) while b: b[:1] = b''
10 loops, best of 3: 330 msec per loop

- after:

PCbuild\amd64\python.exe -m timeit b=bytearray(10) while b: b[-1:] = b''
10 loops, best of 3: 73.9 msec per loop

PCbuild\amd64\python.exe -m timeit b=bytearray(10) while b: b[:1] = b''
10 loops, best of 3: 73.8 msec per loop

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 A generic example is to parse messages out of a TCP stream. Basically
any protocol transported on TCP needs such a facility, or has to find
workarounds (which are either suboptimal or complicated).

Could you please show concrete code in which you are going to use this 
optimization?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 Could you please show concrete code in which you are going to use this 
 optimization?

There's no need to use this optimization. Any networking code that has to 
find message boundaries and split on them will benefit. If you don't understand 
that, I'm not willing to explain it for you.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Deleting a slice at the front of a bytearray have linear complexity from the 
size of a bytearray (in any case del b[:1] is a little faster than b[:1] = 
b''). I doubt than any performance critical code do it instead of increasing an 
index in constant time.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 Deleting a slice at the front of a bytearray have linear complexity
 from the size of a bytearray (in any case del b[:1] is a little faster
 than b[:1] = b''). I doubt than any performance critical code do it
 instead of increasing an index in constant time.

Increasing an index requires that you compact the bytearray from time to
time, lest it fills the whole memory.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

The same is true with your patch.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 The same is true with your patch.

I don't understand. What is true with my patch?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

You increase internal index in a bytearray. A bytearray with small visible 
length can consume much hidden memory.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 You increase internal index in a bytearray. A bytearray with small
 visible length can consume much hidden memory.

No, because PyByteArray_Resize() is always called afterwards to ensure
that the buffer is resized when it gets below 50% usage.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 No, because PyByteArray_Resize() is always called afterwards to ensure
 that the buffer is resized when it gets below 50% usage.


I.e. the bytearray is compacted from time to time.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread STINNER Victor

STINNER Victor added the comment:

@Serhiy: I doubt than any performance critical code do it instead of 
increasing an index in constant time.

Sorry, I don't get your point. It's not become Python is inefficient that 
developers must develop workarounds. Antoine's patch is simple, elegant, and 
offer better performances for free.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

  No, because PyByteArray_Resize() is always called afterwards to ensure
  that the buffer is resized when it gets below 50% usage.
 
 I.e. the bytearray is compacted from time to time.

Is there a problem with that?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 Is there a problem with that?

No more than with msg198657.

 Sorry, I don't get your point. It's not become Python is inefficient that 
 developers must develop workarounds.

I'm not sure that workarounds are much worst than using this optimization. At 
least we still not seen real code which will benefit from this optimization.

 Antoine's patch is simple, elegant, and offer better performances for free.

It offer better performances for free only for suboptimal code which 
currently have O(N) instead of O(1).

One of most used cases for bytearrays is accumulating. And the patch slow down 
this case.

$ ./python -m timeit  b = bytearray(); a = b'x'  for i in range(1): b += 
a  bytes(b)

Without patch: 4.3 msec per loop
With patch: 4.62 msec per loops

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 One of most used cases for bytearrays is accumulating.
 And the patch slow down this case.

I see no difference here. You are seeing a 10% slowdown, which is possibly a 
measurement glitch. The bottom line is that the performance remains 
approximately the same.

 It offer better performances for free only for suboptimal code
 which currently have O(N) instead of O(1).

The problem is the suboptimal code is also the natural way to write such 
code. If you know a simple and idiomatic way to write an optimal bytes FIFO, 
then please share it with us. Otherwise, I will happily ignore your line of 
argument here.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread STINNER Victor

STINNER Victor added the comment:

 One of most used cases for bytearrays is accumulating. And the patch slow 
 down this case.

Please don't use the raw timeit command for micro-benchmarks, it is not 
reliable. For example, I'm unable to reproduce your slow down (7% on a 
microbenchmark is not that large).

My micro-benchmark using my benchmark.py script:

Common platform:
Python version: 2.7.3 (default, Aug 9 2012, 17:23:57) [GCC 4.7.1 20120720 (Red 
Hat 4.7.1-5)]
Python unicode implementation: UCS-4
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Timer: time.time
Platform: Linux-3.9.4-200.fc18.x86_64-x86_64-with-fedora-18-Spherical_Cow
Timer precision: 954 ns
CFLAGS: -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv
Bits: int=32, long=64, long long=64, size_t=64, void*=64

Platform of campaign original:
SCM: hg revision=687dd81cee3b tag=tip branch=default date=2013-09-29 22:18 
+0200
Date: 2013-09-30 00:59:42

Platform of campaign patched:
SCM: hg revision=687dd81cee3b+ tag=tip branch=default date=2013-09-29 22:18 
+0200
Date: 2013-09-30 00:59:07

+-+
Tests   |original | patched
+-+
10**1 bytes |  859 ns (*) |  864 ns
10**3 bytes | 55.8 us (*) | 56.4 us
10**5 bytes | 5.42 ms | 5.41 ms (*)
10**7 bytes |  578 ms |  563 ms (*)
+-+
Total   |  583 ms |  569 ms (*)
+-+

So performances are the same with the patch.

--
Added file: http://bugs.python.org/file31916/bench_bytearray.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread STINNER Victor

STINNER Victor added the comment:

Oh, by the way:

 $ ./python -m timeit  b = bytearray(); a = b'x'  for i in range(1): b 
 += a  bytes(b)

I'm not sure that it is what you expected: bytearray() is only initialized once 
(setup of timeit). You probably want to reinitialize at each loop.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-26 Thread STINNER Victor

STINNER Victor added the comment:

 Mercurial has another implementation strategy for a similar thing:
 http://selenic.com/repo/hg/file/50d721553198/mercurial/util.py#l935

I found an interesting comment in the following issue:

I think the trouble we get into is chunkbuffer() creates new large strings by 
concatenation and causes memory fragmentation. Keeping a list of chunks might 
be more efficient.

http://bz.selenic.com/show_bug.cgi?id=1842#c17

@Antoine: Do you know if your patch may reduce the memory fragmentation on 
bytearray front-slicing?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-26 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 @Antoine: Do you know if your patch may reduce the memory
 fragmentation on bytearray front-slicing?

It reduces the number of allocations so, yes, it can reduce memory
fragmentation.
We cannot really use a list of chunks for bytearray since it is
supposed to be usable as a contiguous buffer (using the buffer API).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-26 Thread STINNER Victor

STINNER Victor added the comment:

Could you please add unit tests for check that ob_start is used instead of 
memmove()?

I didn't find a function for that in _testcapi. I tried to test it using 
sys.getsizeof(), but the size is not reliable (the bytearray buffer is not 
always shrinked, it depends on the new size).

The best is probably to add a new function in _testcapi to get private 
attributes: ob_exports, ob_alloc, ob_start, ob_bytes. Using these attributes, 
it becomes easy to check that fast-path are correctly optimized (eg. increases 
ob_start instead of getting a new ob_bytes buffer).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-26 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 Could you please add unit tests for check that ob_start is used
 instead of memmove()?

How would I do that? Most of the time we don't unit-test performance
improvements (i.e. there are no tests that list.append() is O(1), for
example).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou

New submission from Antoine Pitrou:

If you delete a slice at the end of a bytearray, it is naturally optimized 
(thanks to the resizing strategy). However, if you delete a slice at the front 
of a bytearray, it is not: a memmove() gets done every time.

$ ./python -m timeit b=bytearray(1) while b: b[-1:] = b''
100 loops, best of 3: 5.67 msec per loop
$ ./python -m timeit b=bytearray(1) while b: b[:1] = b''
100 loops, best of 3: 6.67 msec per loop

$ ./python -m timeit b=bytearray(5) while b: b[-1:] = b''
10 loops, best of 3: 28.3 msec per loop
$ ./python -m timeit b=bytearray(5) while b: b[:1] = b''
10 loops, best of 3: 61.1 msec per loop

$ ./python -m timeit b=bytearray(10) while b: b[-1:] = b''
10 loops, best of 3: 59.4 msec per loop
$ ./python -m timeit b=bytearray(10) while b: b[:1] = b''
10 loops, best of 3: 198 msec per loop

This makes implementing a fifo using bytearray a bit suboptimal. It shouldn't 
be very hard to improve.

--
components: Interpreter Core
messages: 198385
nosy: pitrou
priority: normal
severity: normal
stage: needs patch
status: open
title: bytearray front-slicing not optimized
type: performance
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

And the same is for a list. List and bytearray are wrong types for front 
deleting. I don't think we should increase the size of bytearray, and 
complicate and slowdown it for such special purpose.

If you want to implement a fifo using bytearray more optimal, defer the 
deleting until used size less than a half of allocated size. See for example 
XMLPullParser.read_events() in Lib/xml/etree/ElementTree.py.

--
nosy: +serhiy.storchaka

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 And the same is for a list. List and bytearray are wrong types for
 front deleting.

There is no bytedeque().

 I don't think we should increase the size of
 bytearray, and complicate and slowdown it for such special purpose.

I don't think it would really slow it down. It should be a simple
optimization. And FIFO buffers are quite common when writing parsers
for network applications.

 If you want to implement a fifo using bytearray more optimal, defer
 the deleting until used size less than a half of allocated size. See
 for example XMLPullParser.read_events() in
 Lib/xml/etree/ElementTree.py.

Of course, I wrote that code. Still, doing it manually is suboptimal
and cumbersome when it could be done transparently.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Here is a patch. Benchmarks (under Linux where realloc is fast; the gap may be 
wider under Windows):

$ ./python -m timeit b=bytearray(10) while b: b[:1] = b''
- before: 225 msec per loop
- after: 60.4 msec per loop

$ ./python -m timeit b=bytearray(10) while b: b[:200] = b''
- before: 1.17 msec per loop
- after: 350 usec per loop

--
keywords: +patch
nosy: +haypo
Added file: http://bugs.python.org/file31873/bytea_slice.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
stage: needs patch - patch review

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Could you please provide an example which uses this feature?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 Could you please provide an example which uses this feature?

A generic example is to parse messages out of a TCP stream. Basically
any protocol transported on TCP needs such a facility, or has to find
workarounds (which are either suboptimal or complicated).

Mercurial has another implementation strategy for a similar thing:
http://selenic.com/repo/hg/file/50d721553198/mercurial/util.py#l935

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19087
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com