Reading a very large text file of coordinates and heights

Hanlie Pretorius Tue, 29 Mar 2011 01:34:04 -0700

Hi,

I'm working on Windows with Python 2.6.


I've received a 10 m resolution DEM in xyz text format. The file is
about 1 GB in size. The file is too big to open in a text editor such
as Notepad, and I don't have Office 2007, so Excel cuts off the file
after 67 000 lines.

So, I need to write a Python script to read this file and extract
only the data that falls within my study area. According to QGIS, the
extents of my area are:
xMin,yMin -66483.3,-3155672.31 : xMax,yMax -33474.9,-3122229.70

This is the first unprocessed line in the file, which I extracted using Python:
  -74289.694 -3182439.485  2092.029

The spacing between the lines is not consistent, which is another
reason why I need to manipulate the data so that GRASS can import it.
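
One way to cope with inconsistent spacing is to split each line on whitespace rather than rely on fixed character positions. This is only a minimal sketch, assuming every data line holds exactly three whitespace-separated numbers like the sample line above; `parse_xyz` is a hypothetical helper name, not part of any library:

```python
def parse_xyz(line):
    """Split one line into x, y and z floats, whatever the spacing."""
    # str.split() with no argument splits on any run of whitespace
    # and ignores leading and trailing blanks (including the newline).
    x, y, z = line.split()
    return float(x), float(y), float(z)


# The first line of the file, as quoted above.
sample = "  -74289.694 -3182439.485  2092.029\n"
x, y, z = parse_xyz(sample)
```

A line with more or fewer than three fields raises a ValueError here, which at least makes malformed rows visible instead of silently misreading them.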

Reading the whole file at once causes a MemoryError in Python, so I've
written the following code to read it in chunks, with some help from
the web - <http://effbot.org/zone/readline-performance.htm>:

[code]
readfile='bethlehem.xyz'

file = open(readfile)

while 1:
    # read a chunk of the file
    lines = file.readlines(100000)
    if not lines:
        break
    for line in lines:
        # extract x, y and z
        x = line[2:12]
        y = line[13:25]
        z = line[27:35]
        if x >= -66483.300 and x <= -33474.900:
            if y >= -3155672.310 and y <= -3122229.700:
                print line
[/code]

This code runs for a (relatively) short while and exits having printed no lines.

My questions are thus:
1. Will this code iterate through the whole file, or does it read only
the first 100 000 bytes of text? If it reads only the first 100 000
bytes, how can I change it to read the whole file in chunks?

2. Is the logic in my if statements correct to extract the values for
my study area? If not, how should I change it?

Thanks
Hanlie
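
On the two questions above: as far as I can tell, repeated `readlines(100000)` calls inside the `while` loop do walk through the whole file, since each call resumes where the previous one stopped. The empty output more likely comes from the `if` tests: the slices are strings, and Python 2 compares a string with a float by type rather than by numeric value, so the range test never succeeds. Below is a minimal sketch of one possible fix, assuming three whitespace-separated numeric columns per line; the function name `extract_study_area` and the output file name are hypothetical:

```python
# Study-area extents as reported by QGIS (taken from the message above).
X_MIN, X_MAX = -66483.3, -33474.9
Y_MIN, Y_MAX = -3155672.31, -3122229.70


def extract_study_area(in_path, out_path):
    """Copy lines whose x, y fall inside the extents; return how many were kept."""
    kept = 0
    # Nested "with" statements keep this compatible with Python 2.6.
    with open(in_path) as src:
        with open(out_path, "w") as dst:
            # Iterating over the file object reads one line at a time,
            # so the whole file is never held in memory.
            for line in src:
                fields = line.split()
                if len(fields) != 3:
                    continue  # skip blank or malformed lines
                # Convert to float so the comparison is numeric.
                x, y = float(fields[0]), float(fields[1])
                if X_MIN <= x <= X_MAX and Y_MIN <= y <= Y_MAX:
                    dst.write(line)
                    kept += 1
    return kept
```

It could be called as `extract_study_area('bethlehem.xyz', 'study_area.xyz')`. Writing the matching lines to a file rather than printing them keeps the result available for a later GRASS import.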
