In pandas 0.14.0, generic whitespace IS parsed via the C parser, e.g. when specifying '\s+' as the separator. Not sure when you last played with pandas, but the C parser has been in place since late 2012 (version 0.10.0).
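
A minimal sketch of what I mean, assuming a pandas version (0.14.0 or later) where the C engine handles '\s+'. The sample data and variable names below are made up for illustration:

```python
import io

import pandas as pd

# Hypothetical sample data: fields separated by irregular runs of spaces and tabs.
text = "a   b\tc\n1  2.5   3\n4\t5.5 \t 6\n"

# sep=r"\s+" asks for generic whitespace splitting;
# engine="c" requests the C parser explicitly instead of the Python fallback.
df = pd.read_csv(io.StringIO(text), sep=r"\s+", engine="c")

print(df)
```

Passing engine="c" explicitly is only there to make the point; read_csv normally picks the C engine on its own when the options allow it.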
http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#text-parsing-api-changes

> On Jun 30, 2014, at 4:58 PM, Derek Homeier <de...@astro.physik.uni-goettingen.de> wrote:
>
> On 30 Jun 2014, at 04:56 pm, Nathaniel Smith <n...@pobox.com> wrote:
>
>>> A real need, which had also been discussed at length, is a truly performant text IO
>>> function (i.e. one using a compiled ASCII number parser, and optimally also a more
>>> memory-efficient one), but unfortunately all people interested in implementing this
>>> seem to have drifted away (not excluding myself from this)…
>>
>> It's possible we could steal some code from Pandas for this. IIRC they
>> have C/Cython text parsing routines. (It's also an interesting
>> question whether they've fixed the unicode/binary issues, might be
>> worth checking before rewriting from scratch...)
>
> Good point, last time I was playing with Pandas it was not any faster, but now a 10x
> speedup speaks for itself. Their C engine does not support generic whitespace separators,
> but that could probably be addressed in a numpy implementation.
>
> Derek
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion