---- John Gunderman <[EMAIL PROTECTED]> wrote:

> I am parsing the output of mork.pl, a parser for Mork (the Mozilla history
> format). I don't know Perl, so I decided to write a Python script to do what
> I wanted, which is basically to create a dictionary listing each site and
> its corresponding values instead of outputting plaintext. Unfortunately,
> the output of mork.pl is 5000+ lines, so reading the whole document
> wouldn't be that efficient.
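For what it's worth, here is a minimal sketch of the dictionary-building approach. Since no sample data is shown, the tab-separated `site<TAB>value` layout below is an assumption, not the actual mork.pl output; the point is that iterating over the file object reads one line at a time instead of loading everything up front.

```python
def build_history(path):
    """Build a dict mapping each site to a list of its values.

    ASSUMPTION: each input line looks like "site<TAB>value".
    The real mork.pl output may differ -- adjust the split accordingly.
    """
    history = {}
    with open(path) as history_file:
        # Iterating the file object directly is lazy: one line per step,
        # unlike readlines(), which reads the whole file into a list first.
        for line in history_file:
            parts = line.rstrip("\n").split("\t")
            if len(parts) >= 2:
                site, value = parts[0], parts[1]
                history.setdefault(site, []).append(value)
    return history
```

Either way, for a 5000-line file the difference is unlikely to matter in practice.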
If you have enough memory for it to fit, reading the whole file at once is fine.

> Currently it uses:
>
>     for line in history_file.readlines():
>
> but I don't know if this has to read all lines before it goes through it.

Yes, readlines() reads the entire file into a list before the loop starts.

> if it does, then would it be more efficient to use
>
>     while line != '/t':
>         line = history_file.readline()

Probably not. But why so much emphasis on efficiency? Get the program working
first; only if it is too slow should you worry about efficiency. Processing a
5000-line file should not be a problem in Python.

> I was thinking of just appending each character to the string until it sees
> '/t', and then using int() on the string, but is there an easier way?

It would really help to see a sample of the data and the results you want from
it. There are many ways to parse data in Python, from simple string operations
to regular expressions to full-blown parsers. Without knowing what you want to
do, it is impossible to suggest an appropriate method.

Kent

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor