On Mon, 21 Apr 2008 16:10:05 -0700, George Sakkis wrote:

> On Apr 21, 5:30 pm, Ivan Illarionov <[EMAIL PROTECTED]> wrote:
>
>> On 22 Apr, 01:01, Peter Otten <[EMAIL PROTECTED]> wrote:
>>
>> > Ivan Illarionov wrote:
>> > > And even faster:
>> > > a = array.array('i', '\0' + '\0'.join((s[i:i+3] for i in xrange(0, len(s), 3))))
>> > > if sys.byteorder == 'little':
>> > >     a.byteswap()
>>
>> > > I think it's a fastest possible implementation in pure python
>>
>> > Clever, but note that it doesn't work correctly for negative numbers.
>> > For those you'd have to prepend "\xff" instead of "\0".
>>
>> > Peter
>>
>> Thanks for correction.
>>
>> Another step is needed:
>>
>> a = array.array('i', '\0' + '\0'.join((s[i:i+3] for i in xrange(0, len(s), 3))))
>> if sys.byteorder == 'little':
>>     a.byteswap()
>> result = [n if n < 0x800000 else n - 0x1000000 for n in a]
>>
>> And it's still pretty fast :)
>
> Indeed, the array idea is paying off for largeish inputs. On my box
> (Python 2.5, WinXP, 2GHz Intel Core Duo), the cutoff point where
> from3Bytes_array becomes faster than from3Bytes_struct is close to 150
> numbers (=450 bytes).
>
> The struct solution though is now almost twice as fast with Psyco
> enabled, while the array doesn't benefit from it.
> Here are some numbers from a sample run:
>
> *** Without Psyco ***
>   size=1
>     from3Bytes_ord: 0.033493
>     from3Bytes_struct: 0.018420
>     from3Bytes_array: 0.089735
>   size=10
>     from3Bytes_ord: 0.140470
>     from3Bytes_struct: 0.082326
>     from3Bytes_array: 0.142459
>   size=100
>     from3Bytes_ord: 1.180831
>     from3Bytes_struct: 0.664799
>     from3Bytes_array: 0.690315
>   size=1000
>     from3Bytes_ord: 11.551990
>     from3Bytes_struct: 6.390999
>     from3Bytes_array: 5.781636
> *** With Psyco ***
>   size=1
>     from3Bytes_ord: 0.039287
>     from3Bytes_struct: 0.009453
>     from3Bytes_array: 0.098512
>   size=10
>     from3Bytes_ord: 0.174362
>     from3Bytes_struct: 0.045785
>     from3Bytes_array: 0.162171
>   size=100
>     from3Bytes_ord: 1.437203
>     from3Bytes_struct: 0.355930
>     from3Bytes_array: 0.800527
>   size=1000
>     from3Bytes_ord: 14.248668
>     from3Bytes_struct: 3.331309
>     from3Bytes_array: 6.946709
>
>
> And here's the benchmark script:
>
> import struct
> from array import array
>
> def from3Bytes_ord(s):
>     return [n if n<0x800000 else n-0x1000000 for n in
>             ((ord(s[i])<<16) | (ord(s[i+1])<<8) | ord(s[i+2])
>              for i in xrange(0, len(s), 3))]
>
> unpack_i32be = struct.Struct('>l').unpack
> def from3Bytes_struct(s):
>     return [unpack_i32be(s[i:i+3] + '\0')[0]>>8
>             for i in xrange(0,len(s),3)]
>
> def from3Bytes_array(s):
>     a = array('l', ''.join('\0' + s[i:i+3]
>                            for i in xrange(0,len(s), 3)))
>     a.byteswap()
>     return [n if n<0x800000 else n-0x1000000 for n in a]
>
>
> def benchmark():
>     from timeit import Timer
>     for n in 1,10,100,1000:
>         print ' size=%d' % n
>         # cycle between positive and negative
>         buf = ''.join(struct.pack('>i', 1234567*(-1)**(i%2))[1:]
>                       for i in xrange(n))
>         for func in 'from3Bytes_ord', 'from3Bytes_struct', 'from3Bytes_array':
>             print ' %s: %f' % (func,
>                 Timer('%s(buf)' % func ,
>                       'from __main__ import %s; buf=%r' % (func,buf)
>                       ).timeit(10000))
>
>
> if __name__ == '__main__':
>     s = ''.join(struct.pack('>i',v)[1:] for v in
>                 [0,1,-2,500,-500,7777,-7777,-94496,98765,
>                  -98765,8388607,-8388607,-8388608,1234567])
>     assert from3Bytes_ord(s) == from3Bytes_struct(s) == from3Bytes_array(s)
>
>     print '*** Without Psyco ***'
>     benchmark()
>
>     import psyco; psyco.full()
>     print '*** With Psyco ***'
>     benchmark()
>
>
> George
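
For anyone reading this on Python 3, here is a rough sketch of the struct trick from the quoted script. It is an adaptation, not George's code: the name from3bytes_struct_py3 is made up for illustration, and it assumes the input is a bytes object whose length is a multiple of 3. The idea is unchanged: pad each 3-byte group with a zero byte at the low end, unpack it as a big-endian signed 32-bit integer, then shift right by 8 so the sign bit is extended.

```python
import struct

# Pre-compiled unpacker for one big-endian signed 32-bit integer.
_unpack_i32be = struct.Struct(">l").unpack


def from3bytes_struct_py3(s):
    """Decode big-endian signed 24-bit integers from a bytes object.

    Hypothetical Python 3 adaptation; assumes len(s) is a multiple of 3
    (a short trailing group would raise struct.error).
    """
    # Padding on the right puts the 24 data bits in the top of the 32-bit
    # word; the arithmetic right shift then restores the value with its sign.
    return [_unpack_i32be(s[i:i + 3] + b"\0")[0] >> 8
            for i in range(0, len(s), 3)]


print(from3bytes_struct_py3(b"\xff\xff\xfe\x00\x00\x01"))  # [-2, 1]
```

Because the shift does the sign extension, no separate correction step for values of 0x800000 and above is needed here.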
Comments:

You didn't use the faster version of the array approach:

    ''.join('\0' + s[i:i+3] for i in xrange(0,len(s), 3))

is slower than

    '\0' + '\0'.join(s[i:i+3] for i in xrange(0,len(s), 3))

To Bob Greschke:

Struct is fast in Python 2.5 with the struct.Struct class.

The array approach should work with Python 2.3, and it's probably the fastest one (without psyco) for large inputs:

    import array

    def from3bytes_array(s):
        # pad every 3-byte group with a leading zero byte
        a = array.array('i', '\0' + '\0'.join([s[i:i+3] for i in xrange(0, len(s), 3)]))
        a.byteswap() # if your system is little-endian
        # values with the 24-bit sign bit set are negative
        return [n >= 0x800000 and n - 0x1000000 or n for n in a]

-- 
Ivan
-- 
http://mail.python.org/mailman/listinfo/python-list
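
A possible Python 3 version of the array approach, as a sketch only. The name from3bytes_array_py3 and the typecode lookup are my own assumptions and not from the thread: the code looks for a signed typecode that is exactly 4 bytes wide (the 'i' and 'l' sizes are platform-dependent), checks sys.byteorder before swapping, and handles empty input, which the one-liner above does not.

```python
import sys
from array import array


def from3bytes_array_py3(s):
    """Decode big-endian signed 24-bit integers from a bytes object.

    Hypothetical Python 3 adaptation; assumes len(s) is a multiple of 3
    and that the platform has a 4-byte 'i' or 'l' array typecode.
    """
    if not s:
        return []
    # 'i' and 'l' vary in size between platforms, so pick a 4-byte one.
    typecode = next(tc for tc in "il" if array(tc).itemsize == 4)
    # One zero byte in front of every 3-byte group gives 4-byte big-endian
    # words; a single join is cheaper than concatenating per group.
    padded = b"\0" + b"\0".join(s[i:i + 3] for i in range(0, len(s), 3))
    a = array(typecode, padded)
    if sys.byteorder == "little":
        a.byteswap()  # the array was read in native order, data is big-endian
    # The padding byte is zero, so negative values need an explicit fix-up.
    return [n - 0x1000000 if n >= 0x800000 else n for n in a]


print(from3bytes_array_py3(b"\xff\xff\xfe\x00\x00\x01"))  # [-2, 1]
```

Whether this is still the fastest option on a current interpreter is something to measure with timeit rather than assume; the numbers quoted above are for Python 2.5.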