In <[EMAIL PROTECTED]>, Diez B. Roggisch wrote:

> jvdb wrote:
>> True. But there is another issue attached to the one I wrote.
>> When I know how often this occurs, I know the number of pages in the
>> file. After that I would like to be able to extract a given amount of
>> data:
>> file x contains 20 <0C>. Then, for example, I would like to extract from
>> instance 5 to instance 12 from the file.
>> The reason why I want to do this: the 0C stands for a page break in PCL
>> language. This way I would be able to extract a certain number of
>> pages from the file.
>
> And? Finding the respective indices by using
>
> last_needle_position = 0
> positions = []
> while last_needle_position != -1:
>     last_needle_position = contents.find(needle, last_needle_position+1)
>     if last_needle_position != -1:
>         positions.append(last_needle_position)
>
>
> will find all the page breaks. Then just slice contents appropriately.
> Did you read the Python tutorial?

Maybe splitting at '\x0c', selecting/slicing the wanted pages and joining
them again is enough, depending on the size of the files and the available
memory, of course.

One problem I see is that '\x0c' may not always mark the end of a page.
I guess it may occur in "rastered image" data too.

Ciao,
	Marc 'BlackJack' Rintsch
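
A minimal sketch of the split/slice/join idea, assuming the whole file fits in memory and that every '\x0c' really is a page break (which, as noted, may not hold for raster data). The function name `extract_pages` and the 1-based, inclusive page numbering are illustrative choices, not something from the thread:

```python
FORM_FEED = b"\x0c"  # the <0C> byte the thread treats as a PCL page break


def extract_pages(data, first, last):
    """Return pages first..last (1-based, inclusive) from raw file bytes.

    Hypothetical helper: assumes each form feed ends exactly one page.
    """
    # Chunk i (0-based) is everything before the (i+1)-th form feed.
    pages = data.split(FORM_FEED)
    selected = pages[first - 1:last]
    # Re-join and terminate the last selected page with its own form feed.
    return FORM_FEED.join(selected) + FORM_FEED


# Example with made-up page contents:
sample = b"p1\x0cp2\x0cp3\x0c"
print(extract_pages(sample, 2, 3))
```

For a real file, read it in binary mode (`open(path, "rb").read()`) and write the result the same way. Note that anything before the first selected page, including whatever sits at the start of the file, is not carried over by this sketch.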