Tim Roberts <t...@probo.com> wrote:
> Jimmie He <jimmie...@gmail.com> wrote:
> >When I run the readbmp on an example.bmp (about 100k), the shell
> >becomes "No response"; when I change f.read() to f.read(1000), it is
> >OK. Could someone tell me the exact reason for this?
> >Thank you in advance!
> >
> >Python code as below!!
> >
> >import binascii
> >
> >def read_bmp():
> >    f = open('example.bmp','rb')
> >    rawdata = f.read()                  #f.read(1000) is ok
> >    hexstr = binascii.b2a_hex(rawdata)  #Get an HEX number
> >    bsstr = bin (int(hexstr,16))[2:]
>
> I suspect the root of the problem here is that you don't understand what
> this is actually doing. You should run this code in the command-line
> interpreter, one line at a time, and print the results.
>
> The "read" instruction produces a string with 100k bytes. The b2a_hex then
> produces a string with 200k bytes. Then, int(hexstr,16) takes that 200,000
> byte hex string and converts it to an integer, roughly equal to 10 to the
> 240,000 power, a number with some 240,000 decimal digits. You then convert
> that integer to a binary string. That string will contain 800,000 bytes.
> You then drop the first two characters and print the other 799,998 bytes,
> each of which will be either '0' or '1'.
>
> I am absolutely, positively convinced that's not what you wanted to do.
> What point is there in printing out the binary equivalent of a bitmap?
>
> Even if you did, it would be much quicker for you to do the conversion one
> byte at a time, completely skipping the conversion to hex and then the
> creation of a massive multi-precision number. Example:
>
>     f = open('example.bmp','rb')
>     rawdata = f.read()
>     bsstr = []
>     for b in rawdata:
>         bsstr.append( bin(ord(b)) )
>     bsstr = ''.join(bsstr)
>
> or even:
>
>     f = open('example.bmp','rb')
>     bsstr = ''.join( bin(ord(b))[2:] for b in f.read() )

Exactly my idea at first.
But then I started to time it (using the timeit module) by comparing
the following functions:

    import binascii

    # Original version
    def c1( rawdata ) :
        h = binascii.b2a_hex( rawdata )
        z = bin( int( h, 16 ) )[ 2 : ]
        # Restore the leading zero bits that int() drops
        return '0' * ( 8 * len( rawdata ) - len( z ) ) + z

    # Convert each byte directly
    def c2( rawdata ) :
        return ''.join( bin( ord( x ) )[ 2 : ].rjust( 8, '0' )
                        for x in rawdata )

    # Convert each byte using a list for table look-up
    def c3( rawdata ) :
        h = [ bin( i )[ 2 : ].rjust( 8, '0' ) for i in range( 256 ) ]
        return ''.join( h[ ord( x ) ] for x in rawdata )

    # Convert each byte using a dictionary for table look-up (avoids
    # lots of ord() calls)
    def c4( rawdata ) :
        h = { chr( i ) : bin( i )[ 2 : ].rjust( 8, '0' )
              for i in range( 256 ) }
        return ''.join( h[ x ] for x in rawdata )

As you can see, in c3() and c4() I even tried to speed things up
further by using a table look-up instead of calling bin() etc. on
each byte. But the result was that c2() is nearly 15 times slower
than c1(), c3() about 3 times slower and c4() still more than 2 times
slower!

So the method the OP uses seems to be quite a bit more efficient than
one might be tempted to assume. I would guess that the reason is that
c1() makes just a small number of calls to functions that probably
aren't implemented in Python but in C, and thus can be a lot faster
than anything you could achieve in Python, while the other functions
use a for loop in Python, which seems to account for a good part of
the CPU time used. To test that, I split the 'rawdata' string into a
list of characters (i.e. single-letter strings) and reassembled it
using join() and a for loop:

    r = list( rawdata )
    z = ''.join( x for x in r )

The second line alone took about 1.7 times longer than the whole,
seemingly convoluted c1() function!

What I take away from this is that a lot of the assumptions one is
prone to make when coming from e.g. a C/C++ background can be quite
misleading when extrapolated to Python (or other interpreted
languages)...
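For anyone who wants to repeat the comparison on a current interpreter,
a minimal sketch along the following lines could be used. It assumes
Python 3, where iterating over a bytes object yields integers, so the
ord() calls disappear. The function names, the synthetic test data and
the repeat count are illustrative choices and not taken from the
measurements above.

```python
import binascii
import timeit


def via_big_int(rawdata: bytes) -> str:
    """Hex-encode, parse as one huge integer, format as binary (like c1)."""
    if not rawdata:
        return ''  # int(b'', 16) would raise ValueError
    z = bin(int(binascii.b2a_hex(rawdata), 16))[2:]
    # int() drops leading zero bits, so pad back to 8 bits per byte.
    return z.rjust(8 * len(rawdata), '0')


def via_per_byte(rawdata: bytes) -> str:
    """Format every byte separately in a Python-level loop (like c2)."""
    return ''.join(format(b, '08b') for b in rawdata)


def via_table(rawdata: bytes) -> str:
    """Look each byte up in a precomputed 256-entry table (like c3)."""
    table = [format(i, '08b') for i in range(256)]
    return ''.join(table[b] for b in rawdata)


if __name__ == '__main__':
    # Synthetic stand-in for the roughly 100k example.bmp (an assumption).
    data = bytes(range(256)) * 400

    # All variants must agree before their timings mean anything.
    assert via_big_int(data) == via_per_byte(data) == via_table(data)

    for func in (via_big_int, via_per_byte, via_table):
        seconds = timeit.timeit(lambda: func(data), number=5)
        print('%-12s %.4f s for 5 runs' % (func.__name__, seconds))
```

Run as a script, this prints one timing per function. The absolute
numbers, and possibly the ratios as well, may differ from those quoted
above depending on the interpreter version and the machine.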
Best regards, Jens
-- 
  \   Jens Thoms Toerring  ___      j...@toerring.de
   \__________________________      http://toerring.de