Jackie Lee wrote:
Hello there,

I have a 22 GB binary file, and I want to change the values at specific
positions. Because of the size of the file, I doubt my code is an
efficient one:

#! /usr/bin/env python
#coding=utf-8
import sys
import struct

try:
        f=open(sys.argv[1],'rb+')
except (IOError,Exception):
    print '''usage:
        scriptname segyfilename
'''
    sys.exit(1)

#skip EBCDIC header
try:
    f.seek(3200)
except Exception:
    print 'Oops! your file is broken..'

#read binary header
binhead = f.read(400)
ns = struct.unpack('>h',binhead[20:22])[0]
if ns < 0:
    print 'file read error'
    sys.exit(1)

#read trace header
while True:
    f.seek(28,1)
    f.write(struct.pack('>h',1))
    f.seek(212,1)
    f.seek(ns*4,1)

f.close()

I don't see a question anywhere. So perhaps you just want comments on your code.

1) How do you plan to test this?
2) Consider doing a lot more checking to see that you have in fact a file of the right type.
3) Fix indentation - perhaps you've accidentally used a tab in the source.
4) Provide a termination condition for the while True loop, which currently will (I think) go forever, or perhaps until the disk fills up.
5) Depending on the purpose of this file, you should consider making the changes on a copy, then deleting and renaming. As it stands, if the program gets aborted part way through, there's no way to know how far it got. Since it's just clobbering bytes, it would be safe to rerun the same program again, but many times that's not the case. And this program clearly isn't finished yet, so perhaps it's not true here either.
6) I don't see anything inefficient about it. The nature of the problem is going to be very slow (for small values of ns), but I don't know what your code could do to speed it up. Perhaps make sure the file is on a fast drive, and not RAID 5.
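
To make points 1, 2 and 4 concrete, here is a rough sketch, not a finished tool. It works out the trace count from the file size so the loop has a definite end, and it refuses to touch a file whose size is not a whole number of traces. The function name patch_traces and the constants are made up for illustration. It assumes the standard 240-byte SEG-Y trace header and 4-byte samples (the ns*4 in your code), and it is written for Python 3. Note that the 28 + 2 + 212 steps in your loop add up to 242 bytes rather than 240, so check that against your file layout.

```python
import os
import struct

# Layout assumptions (not taken from the original post, except ns*4):
TEXT_HEADER = 3200    # EBCDIC textual header
BINARY_HEADER = 400   # binary file header
TRACE_HEADER = 240    # standard SEG-Y trace header length
SAMPLE_BYTES = 4      # bytes per sample, as implied by ns*4


def patch_traces(path, value=1, offset=28):
    """Write a big-endian int16 `value` at `offset` in every trace header.

    Returns the number of traces patched. Raises ValueError if the file
    does not look like a whole number of fixed-length traces.
    """
    size = os.path.getsize(path)
    if size < TEXT_HEADER + BINARY_HEADER:
        raise ValueError("file too short to hold the headers")

    with open(path, "rb+") as f:
        f.seek(TEXT_HEADER)
        binhead = f.read(BINARY_HEADER)
        ns = struct.unpack(">h", binhead[20:22])[0]
        if ns <= 0:
            raise ValueError("bad sample count in binary header: %d" % ns)

        trace_len = TRACE_HEADER + ns * SAMPLE_BYTES
        data_len = size - TEXT_HEADER - BINARY_HEADER
        if data_len % trace_len:
            raise ValueError("file size is not a whole number of traces")

        # The trace count is known up front, so the loop terminates
        # and never writes past the end of the file.
        ntraces = data_len // trace_len
        field = struct.pack(">h", value)
        first = TEXT_HEADER + BINARY_HEADER
        for i in range(ntraces):
            f.seek(first + i * trace_len + offset)
            f.write(field)
    return ntraces
```

For testing, build a tiny file with the same layout (3200 zero bytes, a 400-byte header with ns set, a couple of zero-filled traces), run patch_traces on it, and compare the bytes before and after. That is much easier than debugging against the 22 GB original.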

DaveA

--
http://mail.python.org/mailman/listinfo/python-list
