Re: how to do reading of binary files?
jvdb wrote:
> Hi all, I need some help with the following issue. I can't seem to solve it. I have a binary (PCL) file. In this file I want to search for specific codes (like 0C). I have tried to solve it by reading the file character by character, but this is very slow. Especially with large files (10 MB) this takes quite some time. Does anyone have a hint/clue/solution for this?

What does the searching have to do with the reading? 10 MB easily fits into the main memory of a decent PC, so just do

    contents = open(file).read()  # yes, I know I should close the file...
    print contents.find('\x0c')

Diez
-- http://mail.python.org/mailman/listinfo/python-list
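A minimal sketch of the read-then-search approach suggested above, written for a current Python 3 interpreter (an assumption; the thread itself uses Python 2 syntax). It opens the file in binary mode and closes it via a with-block. The helper name find_first and the temporary sample file are hypothetical and only serve to make the example runnable.

```python
import os
import tempfile

FORM_FEED = b"\x0c"  # the 0C code the original poster is looking for


def find_first(path, needle=FORM_FEED):
    """Return the offset of the first occurrence of needle, or -1 if absent."""
    # Binary mode leaves every byte untouched; the with-block closes the file.
    with open(path, "rb") as handle:
        contents = handle.read()
    return contents.find(needle)


# Hypothetical stand-in for a real PCL file, written to a temporary path.
with tempfile.NamedTemporaryFile(delete=False, suffix=".pcl") as tmp:
    tmp.write(b"page one\x0cpage two\x0c")
    sample_path = tmp.name

first_offset = find_first(sample_path)
os.remove(sample_path)
print(first_offset)
```

Reading the whole file at once trades memory for speed, which is reasonable at the 10 MB size mentioned in the thread.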
Re: how to do reading of binary files?
On 8 Jun, 14:07, Diez B. Roggisch [EMAIL PROTECTED] wrote:
> jvdb wrote:
> > ..
> What does the searching have to do with the reading? 10 MB easily fits into the main memory of a decent PC, so just do
>
>     contents = open(file).read()  # yes, I know I should close the file...
>     print contents.find('\x0c')
>
> Diez

True. But there is another issue attached to the one I described. Once I know how often this code occurs, I know the number of pages in the file. After that I would like to be able to extract a given range of data: say file x contains 20 occurrences of 0C; then, for example, I would like to extract everything from occurrence 5 to occurrence 12. The reason I want to do this: 0C stands for a page break in the PCL language. This way I would be able to extract a certain number of pages from the file.
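Counting the occurrences, as described above, can be sketched with the count method of a bytes object in Python 3 (an assumption; the thread predates it). The function name and the sample data are hypothetical. Note that this counts every 0C byte, so it only equals the page count if 0C never appears inside other binary data in the file.

```python
FORM_FEED = b"\x0c"  # page break code named in the thread


def count_page_breaks(contents):
    """Count every form feed byte in the given bytes object."""
    return contents.count(FORM_FEED)


# Hypothetical sample: three short "pages", each terminated by a form feed.
sample = b"p1\x0cp2\x0cp3\x0c"
page_breaks = count_page_breaks(sample)
print(page_breaks)
```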
how to do reading of binary files?
Hi all, I need some help with the following issue. I can't seem to solve it. I have a binary (PCL) file. In this file I want to search for specific codes (like 0C). I have tried to solve it by reading the file character by character, but this is very slow. Especially with large files (10 MB) this takes quite some time. Does anyone have a hint/clue/solution for this?

Thanks already!
Jeroen
Re: how to do reading of binary files?
On 8 Jun, 15:19, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
> In [EMAIL PROTECTED], Diez B. Roggisch wrote:
> > jvdb wrote:
> > > True. But there is another issue attached to the one I described. Once I know how often this code occurs, I know the number of pages in the file. After that I would like to be able to extract a given range of data: say file x contains 20 occurrences of 0C; then, for example, I would like to extract everything from occurrence 5 to occurrence 12. The reason I want to do this: 0C stands for a page break in the PCL language. This way I would be able to extract a certain number of pages from the file.
> >
> > And? Finding the respective indices by using
> >
> >     last_needle_position = 0
> >     positions = []
> >     while last_needle_position != -1:
> >         last_needle_position = contents.find(needle, last_needle_position+1)
> >         if last_needle_position != -1:
> >             positions.append(last_needle_position)
> >
> > will find all the page breaks. Then just slice contents appropriately. Did you read the Python tutorial?
>
> Maybe splitting at '\x0c', selecting/slicing the wanted pages and joining them again is enough, depending on the size of the files and the available memory, of course.
>
> One problem I see is that '\x0c' may not always be the page end. It may occur in rastered image data too, I guess.
>
> Ciao, Marc 'BlackJack' Rintsch

Hi, your last comment is also something I have noticed. There are a number of cases where this happens, and I will have to deal with those as well. I will dive into this on Monday, after this hot weekend.

cheers, Jeroen
Re: how to do reading of binary files?
jvdb wrote:
> On 8 Jun, 14:07, Diez B. Roggisch [EMAIL PROTECTED] wrote:
> > jvdb wrote:
> > > ..
> > What does the searching have to do with the reading? 10 MB easily fits into the main memory of a decent PC, so just do
> >
> >     contents = open(file).read()  # yes, I know I should close the file...
> >     print contents.find('\x0c')
> >
> > Diez
>
> True. But there is another issue attached to the one I described. Once I know how often this code occurs, I know the number of pages in the file. After that I would like to be able to extract a given range of data: say file x contains 20 occurrences of 0C; then, for example, I would like to extract everything from occurrence 5 to occurrence 12. The reason I want to do this: 0C stands for a page break in the PCL language. This way I would be able to extract a certain number of pages from the file.

And? Finding the respective indices by using

    last_needle_position = 0
    positions = []
    while last_needle_position != -1:
        last_needle_position = contents.find(needle, last_needle_position+1)
        if last_needle_position != -1:
            positions.append(last_needle_position)

will find all the page breaks. Then just slice contents appropriately. Did you read the Python tutorial?

diez
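A hedged variant of the loop above, as a self-contained Python 3 sketch (an assumption; the thread uses Python 2). The posted loop begins its first search at offset 1, so a needle sitting at offset 0 would be skipped; this version begins at offset 0. The function name find_all and the sample bytes are hypothetical.

```python
def find_all(contents, needle):
    """Return the offsets of every occurrence of needle in contents."""
    positions = []
    start = 0  # start at offset 0 so a match at the very first byte is found
    while True:
        index = contents.find(needle, start)
        if index == -1:
            return positions
        positions.append(index)
        start = index + 1  # continue just past the match we recorded


# Hypothetical sample data with a form feed at offset 0.
sample = b"\x0cA\x0cBB\x0cCCC\x0c"
positions = find_all(sample, b"\x0c")
print(positions)

# Slice from just after the first break up to and including the third break.
chunk = sample[positions[0] + 1:positions[2] + 1]
print(chunk)
```

Slicing by offsets avoids building a list of all pages, which may matter for larger files.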
Re: how to do reading of binary files?
In [EMAIL PROTECTED], Diez B. Roggisch wrote:
> jvdb wrote:
> > True. But there is another issue attached to the one I described. Once I know how often this code occurs, I know the number of pages in the file. After that I would like to be able to extract a given range of data: say file x contains 20 occurrences of 0C; then, for example, I would like to extract everything from occurrence 5 to occurrence 12. The reason I want to do this: 0C stands for a page break in the PCL language. This way I would be able to extract a certain number of pages from the file.
>
> And? Finding the respective indices by using
>
>     last_needle_position = 0
>     positions = []
>     while last_needle_position != -1:
>         last_needle_position = contents.find(needle, last_needle_position+1)
>         if last_needle_position != -1:
>             positions.append(last_needle_position)
>
> will find all the page breaks. Then just slice contents appropriately. Did you read the Python tutorial?

Maybe splitting at '\x0c', selecting/slicing the wanted pages and joining them again is enough, depending on the size of the files and the available memory, of course.

One problem I see is that '\x0c' may not always be the page end. It may occur in rastered image data too, I guess.

Ciao, Marc 'BlackJack' Rintsch
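The split/slice/join idea can be sketched as follows in Python 3 (an assumption; the thread uses Python 2 strings). The function name extract_pages, its 1-based inclusive page numbering, and the sample bytes are hypothetical. The sketch treats every 0C byte as a page end, so it inherits the raster-data caveat raised above, and whatever precedes the first 0C (including any setup commands) counts as part of page 1.

```python
FORM_FEED = b"\x0c"


def extract_pages(contents, first, last):
    """Return pages first..last (1-based, inclusive), each ending in a form feed."""
    pages = contents.split(FORM_FEED)   # the separator is dropped by split
    selected = pages[first - 1:last]    # convert 1-based range to a slice
    # Re-append the form feed that split removed from each selected page.
    return b"".join(page + FORM_FEED for page in selected)


# Hypothetical sample: four pages, each terminated by a form feed.
sample = b"one\x0ctwo\x0cthree\x0cfour\x0c"
middle = extract_pages(sample, 2, 3)
print(middle)
```

Because split returns whatever follows the final form feed as one extra element, a range reaching past the last real page would pick up that trailing piece; a real tool would probably want to check the range first.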
Re: how to do reading of binary files?
On 2007-06-08, jvdb [EMAIL PROTECTED] wrote:
> I have a binary (PCL) file. In this file I want to search for specific codes (like 0C). I have tried to solve it by reading the file character by character, but this is very slow. Especially with large files (10 MB) this takes quite some time. Does anyone have a hint/clue/solution for this?

I'd memory-map the file.

http://docs.python.org/lib/module-mmap.html

If you prefer it to appear as an array of bytes instead of a string, the various numeric/array packages can do that.

Numarray: http://stsdas.stsci.edu/numarray/numarray-1.5.html/module-numarray.memmap.html
Vmaps: http://snafu.freedom.org/Vmaps/Vmaps.html
Numpy: documentation is not free

Since I can't point you to Numpy docs, here's a link to a newsgroup thread with an example for numpy:

http://groups.google.com/group/comp.lang.python/browse_frm/thread/c63c3e281df99897/2336baa98386d5e7

-- Grant Edwards (grante at visi.com)
Yow! I like your SNOOPY POSTER!!
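A minimal sketch of the memory-mapping suggestion using the standard mmap module, assuming Python 3.2 or later (where mmap objects work as context managers). The helper name find_all_mmap and the temporary sample file are hypothetical. The operating system pages the file in on demand, so the whole file is not copied into a Python object first.

```python
import mmap
import os
import tempfile


def find_all_mmap(path, needle=b"\x0c"):
    """Return the offsets of needle in the file at path, via a read-only mmap."""
    positions = []
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; mapping an empty file raises ValueError.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            index = mapped.find(needle)
            while index != -1:
                positions.append(index)
                index = mapped.find(needle, index + 1)
    return positions


# Hypothetical stand-in for a real PCL file.
with tempfile.NamedTemporaryFile(delete=False, suffix=".pcl") as tmp:
    tmp.write(b"ab\x0ccd\x0c")
    sample_path = tmp.name

mapped_positions = find_all_mmap(sample_path)
os.remove(sample_path)
print(mapped_positions)
```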
Re: how to do reading of binary files?
On Jun 8, 2:07 am, Diez B. Roggisch [EMAIL PROTECTED] wrote:
> ...
> What does the searching have to do with the reading? 10 MB easily fits into the main memory of a decent PC, so just do
>
>     contents = open(file).read()  # yes, I know I should close the file...
>     print contents.find('\x0c')
>
> Diez

Better make that open(file, 'rb').
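Why binary mode matters can be illustrated with a small Python 3 sketch (an assumption; under the Python 2 of the thread, the visible difference was mainly on Windows). Text mode translates line endings while reading, so byte offsets found in text-mode data may not match the offsets in the file. The sample bytes and the latin-1 encoding are chosen purely for illustration.

```python
import os
import tempfile

raw = b"a\r\nb\x0c"  # hypothetical data: a CRLF pair followed by a form feed

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

# Binary mode: the bytes come back exactly as stored.
with open(path, "rb") as handle:
    as_bytes = handle.read()

# Text mode: the data is decoded and "\r\n" is translated to "\n".
with open(path, "r", encoding="latin-1") as handle:
    as_text = handle.read()

os.remove(path)

print(as_bytes.find(b"\x0c"))  # offset of the form feed in the real file
print(as_text.find("\x0c"))    # shifted by one because of the translation
```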