Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30.06.2014, at 23:10, Jeff Reback wrote:
> In pandas 0.14.0, generic whitespace IS parsed via the c-parser, e.g.
> specifying '\s+' as a separator. Not sure when you were playing last with
> pandas, but the c-parser has been in place since late 2012. (version 0.8.0)
>
> http://pandas-docs.g

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Jeff Reback
In pandas 0.14.0, generic whitespace IS parsed via the c-parser, e.g. specifying '\s+' as a separator. Not sure when you were playing last with pandas, but the c-parser has been in place since late 2012. (version 0.8.0) http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#text-parsing-a

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30 Jun 2014, at 04:56 pm, Nathaniel Smith wrote:
>> A real need, which had also been discussed at length, is a truly performant text IO
>> function (i.e. one using a compiled ASCII number parser, and optimally also a more
>> memory-efficient one), but unfortunately all people intereste

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Chris Barker
On Mon, Jun 30, 2014 at 9:31 AM, Nathaniel Smith wrote:
> On 30 Jun 2014 17:05, "Chris Barker" wrote:
>
> > Anyway, this all ties in with the text file parsing issues...
>
> Only tangentially though :-)
>
well, a fast text parser (and "text mode") input file will either need to deal with Unico

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On 30 Jun 2014 17:05, "Chris Barker" wrote:
>>
>> It's also an interesting
>> question whether they've fixed the unicode/binary issues,
>
>
> Which brings up the "how do we handle text/strings in numpy?" issue. We had a good thread going here about what the 'S' data type should be, what with py3 a

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Chris Barker
> It's also an interesting
> question whether they've fixed the unicode/binary issues,

Which brings up the "how do we handle text/strings in numpy?" issue. We had a good thread going here about what the 'S' data type should be, what with py3 and all, but I don't think we ever really resolved th

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On Mon, Jun 30, 2014 at 3:47 PM, Derek Homeier wrote:
> Does it make sense to keep maintaining both functions at all? IIRC the idea that
> loadtxt would be the faster version of the two has been discarded long ago,
> thus it seems there is very little, if anything, loadtxt can do that cannot
> be d

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30 Jun 2014, at 04:39 pm, Nathaniel Smith wrote:
> On Mon, Jun 30, 2014 at 12:33 PM, Julian Taylor
> wrote:
>> genfromtxt and loadtxt need an almost full rewrite to fix the botched
>> python3 conversion of these functions. There are a couple threads
>> about this on this list already.
>> The

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On Mon, Jun 30, 2014 at 12:33 PM, Julian Taylor wrote:
> genfromtxt and loadtxt need an almost full rewrite to fix the botched
> python3 conversion of these functions. There are a couple threads
> about this on this list already.
> There are numerous PRs fixing stuff in these functions which I
> c

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Julian Taylor
genfromtxt and loadtxt need an almost full rewrite to fix the botched python3 conversion of these functions. There are a couple threads about this on this list already. There are numerous PRs fixing stuff in these functions which I currently all -1'd because we need to fix the underlying unicode is

[Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
Hi all, I was just having a new look into the mess that is, imo, the support for automatic line ending recognition in genfromtxt, and more generally, the Python file openers. I am glad at least reading gzip files is no longer entirely broken in Python3, but actually detecting in particular “old
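
For reference, the line-ending behaviour under discussion can be reproduced with the standard library alone. The following is only a minimal sketch, assuming Python 3.3 or later; the helper name read_lines_universal and the file name mixed.txt.gz are made up for illustration, and this is not how genfromtxt itself opens files. It relies on newline=None, which asks Python's text layer to treat "\n", "\r\n" and a lone "\r" all as line endings, both for plain files and for gzip files opened in text mode.

```python
import gzip
import os
import tempfile


def read_lines_universal(path, encoding="ascii"):
    # Hypothetical helper, not part of NumPy: returns the lines of a plain
    # or gzip-compressed text file with the line terminators removed.
    opener = gzip.open if path.endswith(".gz") else open
    # newline=None enables universal newline translation, so every
    # recognised line ending arrives as "\n" when iterating over the file.
    with opener(path, mode="rt", encoding=encoding, newline=None) as fh:
        return [line.rstrip("\n") for line in fh]


# Small demonstration with all three line-ending styles in one gzip file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mixed.txt.gz")
    with gzip.open(path, "wb") as fh:
        fh.write(b"1 2\r3 4\r\n5 6\n")
    print(read_lines_universal(path))  # ['1 2', '3 4', '5 6']
```

The list of cleaned lines could then be handed to a parser that accepts an iterable of strings, which sidesteps the file-opening question entirely; whether that is acceptable for large files is a separate matter.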