On 08/19/2011 05:01 PM, Brent Pedersen wrote:
> On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin<jlcon...@gmail.com> wrote:
>> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen<p...@iki.fi> wrote:
>>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote:
>>>> I would like to use numpy's memmap on some data files I have. The first
>>>> 12 or so lines of the files contain text (header information) and the
>>>> remainder has the numerical data. Is there a way I can tell memmap to
>>>> skip a specified number of lines instead of a number of bytes?
>>>
>>> First use standard Python I/O functions to determine the number of
>>> bytes to skip at the beginning and the number of data items. Then pass
>>> in `offset` and `shape` parameters to numpy.memmap.
>>
>> Thanks for that suggestion. However, I'm unfamiliar with the I/O
>> functions you are referring to. Can you point me to the
>> documentation?
>>
>> Thanks again,
>> Jeremy
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
> this might get you started:
>
>
> import numpy as np
>
> # make some fake data with 12 header lines.
> with open('test.mm', 'w') as fhw:
>     print >> fhw, "\n".join('header' for i in range(12))
>     np.arange(100, dtype=np.uint).tofile(fhw)
>
> # use normal python io to determine the offset after 12 lines.
> with open('test.mm') as fhr:
>     for i in range(12): fhr.readline()
>     offset = fhr.tell()

I think that before reading a line the program should check whether
the line starts with "#". Otherwise fhr.readline() may return a very
large chunk of data (possibly the rest of the file content) that ought
to be read only via memmap.

HTH,
Pearu
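
For reference, the check suggested above could be sketched roughly as follows. This is only an illustration under some assumptions that are not in the thread: every header line starts with "#" and ends with a newline, the first byte of the binary data is not "#", and the data are 64-bit unsigned integers. The helper name `header_size` and its `max_line` limit are hypothetical, not part of NumPy.

```python
import os
import tempfile

import numpy as np


def header_size(path, marker=b"#", max_line=4096):
    """Return the byte offset of the first line that does not start with marker."""
    with open(path, "rb") as fh:
        while True:
            pos = fh.tell()
            # Peek at one byte only, so binary data is never pulled in by readline().
            if fh.read(1) != marker:
                # Not a header line (or end of file): the data starts at pos.
                return pos
            # Finish the header line, reading at most max_line further bytes.
            rest = fh.readline(max_line)
            if not rest.endswith(b"\n"):
                raise ValueError("header line too long or unterminated")


# Fake file: 12 header lines (assumed to start with "#") followed by raw data.
path = os.path.join(tempfile.mkdtemp(), "test.mm")
data = np.arange(100, dtype=np.uint64)
with open(path, "wb") as fh:
    fh.write(b"# header\n" * 12)
    data.tofile(fh)

# Number of bytes to skip, then number of items left in the file.
offset = header_size(path)
n_items = (os.path.getsize(path) - offset) // data.dtype.itemsize

# Map only the numerical part of the file.
mm = np.memmap(path, dtype=np.uint64, mode="r", offset=offset, shape=(n_items,))
print(offset, mm[:5])
```

Opening the file in binary mode keeps tell() in plain byte units, which is what the `offset` argument of numpy.memmap expects. If the data could legitimately begin with the byte "#", this check is not sufficient and the number of header lines would have to be known in advance.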