On Tue, Dec 13, 2011 at 11:29 AM, Bruce Southey <[email protected]> wrote:
> Reading data is hard and writing code that suits the diversity in the
> Numerical Python community is even harder!

yup

> Both loadtxt and genfromtxt functions (other functions are perhaps less
> important) perhaps need an upgrade to incorporate the new NA object.

yes, if we are satisfied that the new NA object is, in fact, the way of the
future.

> Here I think loadtxt is a better target than genfromtxt because, as I
> understand it, it assumes the user really knows the data. Whereas
> genfromtxt can ask the data for the appropriate format.
>
> So I agree that a new 'superfast custom CSV reader for well-behaved data'
> function would be rather useful, especially as a replacement for loadtxt.
> By that I mean reading data using a user-specified format that essentially
> follows the CSV format (
> http://en.wikipedia.org/wiki/Comma-separated_values) - its needs are to
> allow for the NA object, skipping lines and user-defined delimiters.

I think that ideally, there could be one interface to reading tabular data --
hopefully, it would be easy for the user to specify what they want, and if
they don't, the code tries to figure it out. Also, under the hood, the "easy"
cases are special-cased to high-performing versions.

genfromtxt sure looks close for an API -- it just needs the "high performance
special cases" under the hood. It may be that the way it's designed makes it
very difficult to do that, though -- I haven't looked closely enough to tell.

At least that's what I'm thinking at the moment.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[email protected]
