Thanks, Dave. The code works fine. I just don't know how f.write works. The documentation says that file.write may not actually write to the file until file.close or file.flush is called, so I don't know whether the following version is more efficient (sorry, I forgot to add a condition to break the loop):

#! /usr/bin/env python
#coding=utf-8
import sys
import struct

try:
    f=open(sys.argv[1],'rb+')
except (IOError,Exception):
    print '''usage:
    scriptname segyfilename
    '''
    sys.exit(1)

#skip EBCDIC header
try:
    f.seek(3200)
except Exception:
    print 'Oops! your file is broken..'

#read binary header
binhead = f.read(400)
ns = struct.unpack('>h',binhead[20:22])[0]
if ns < 0:
    print 'file read error'
    sys.exit(1)

#read trace header
while True:
    f.seek(28,1)
    if f.read(2) == '':
        break
    f.seek(-2,1)
    f.write(struct.pack('>h',1))
    f.seek(210,1)
    f.seek(ns*4,1)

f.close()

On Fri, May 14, 2010 at 6:04 PM, Dave Angel <da...@ieee.org> wrote:
> Jackie Lee wrote:
>>
>> Hello there,
>>
>> I have a 22 GB binary file, a want to change values of specific
>> positions. Because of the volume of the file, I doubt my code a
>> efficient one:
>>
>> #! /usr/bin/env python
>> #coding=utf-8
>> import sys
>> import struct
>>
>> try:
>>     f=open(sys.argv[1],'rb+')
>> except (IOError,Exception):
>>     print '''usage:
>>     scriptname segyfilename
>>     '''
>>     sys.exit(1)
>>
>> #skip EBCDIC header
>> try:
>>     f.seek(3200)
>> except Exception:
>>     print 'Oops! your file is broken..'
>>
>> #read binary header
>> binhead = f.read(400)
>> ns = struct.unpack('>h',binhead[20:22])[0]
>> if ns < 0:
>>     print 'file read error'
>>     sys.exit(1)
>>
>> #read trace header
>> while True:
>>     f.seek(28,1)
>>     f.write(struct.pack('>h',1))
>>     f.seek(212,1)
>>     f.seek(ns*4,1)
>>
>> f.close()
>>
>>
>
> I don't see a question anywhere. So perhaps you just want comments on your
> code.
>
> 1) How do you plan to test this?
> 2) Consider doing a lot more checking to see that you have in fact a file of
> the right type.
> 3) Fix indentation - perhaps you've accidentally used a tab in the source.
> 4) Provide a termination condition for the while True loop, which currently
> will (I think) go forever, or perhaps until the disk fills up.
> 5) Depending on the purpose of this file, you should consider making the
> changes on a copy, then deleting and renaming. As it stands, if the program
> gets aborted part way through, there's no way to know how far it got. Since
> it's just clobbering bytes, it would be safe to rerun the same program
> again, but many times that's not the case. And this program clearly isn't
> finished yet, so perhaps it's not true here either.
> 6) I don't see anything inefficient about it. The nature of the problem is
> going to be very slow (for small values of ns), but I don't know what your
> code could do to speed it up. Perhaps make sure the file is on a fast
> drive, and not RAID 5.
>
> DaveA
>
>

-- 
Jackie
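
On the f.write question at the top of this message, here is a minimal sketch, assuming CPython's default buffering, of how a small write sits in the file object's buffer until flush() or close() is called. The helper name buffered_write_demo and the temporary file are made up for illustration and are not part of the script above.

```python
import os
import tempfile


def buffered_write_demo():
    """Return the on-disk size of a file before and after flush().

    Hypothetical helper for illustration only: it writes two bytes
    through a default-buffered file object and checks the size that
    the operating system reports at each step.
    """
    # Create an empty temporary file and release the raw descriptor.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = open(path, 'rb+')
        # Two bytes is far smaller than the default buffer, so the
        # data is expected to stay in the buffer for now.
        f.write(b'\x00\x01')
        size_before_flush = os.path.getsize(path)
        # flush() hands the buffered bytes to the operating system.
        f.flush()
        size_after_flush = os.path.getsize(path)
        f.close()
        return size_before_flush, size_after_flush
    finally:
        os.remove(path)


if __name__ == '__main__':
    print(buffered_write_demo())
```

If the two sizes differ, the write was held in the buffer until flush(). This only shows when the bytes leave the Python file object; it says nothing about when the operating system commits them to the physical disk.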