Andrew McNamara wrote:
>
> I'm not altogether sure there. The parsing state machine is all
> written in C, and deals with signed chars - I expect we'll need two
> versions of that (or one version that's compiled twice using
> pre-processor macros). Quite a large job. Suggestions gratefully
> recei
Andrew McNamara wrote:
Marc-Andre Lemburg mentioned that he has encountered UTF-16 encoded csv
files, so a reasonable starting point would be the ability to read and
parse, as well as the ability to generate, one of these.
I see. That would be reasonable, indeed. Notice that this is not so much
a "
>>I'm still trying to understand what *needs* to be done - I would move to
>>how this is done only later. What APIs should be extended/changed, and
>>in what way?
[...]
>The reader interface currently returns a row at a time, consuming as many
>lines from the supplied iterable (with the most common
Can you please elaborate on that? What needs to be done, and how is
that going to be done? It might be possible to avoid considerable
uglification.
>>
>> I'm not altogether sure there. The parsing state machine is all written in
>> C, and deals with signed chars - I expect we'll need t
Andrew McNamara wrote:
Can you please elaborate on that? What needs to be done, and how is
that going to be done? It might be possible to avoid considerable
uglification.
I'm not altogether sure there. The parsing state machine is all written in
C, and deals with signed chars - I expect we'll need
>> Yep, although that means we wear the cost of decoding and encoding for
>> all 8 bit input.
>
>Right, but it makes the code very clean and straight forward.
I agree it makes for a very clean solution, and 99% of the time I'd
chose that option.
>Again, it depends on what you need. If performance
Andrew McNamara wrote:
Yes, although it would be nice to also retain the 8-bit versions as well.
You can do so by using latin-1 as default encoding. Works great !
Yep, although that means we wear the cost of decoding and encoding for
all 8 bit input.
Right, but it makes the code very clean and stra
>> Yes, although it would be nice to also retain the 8-bit versions as well.
>
>You can do so by using latin-1 as default encoding. Works great !
Yep, although that means we wear the cost of decoding and encoding for
all 8 bit input.
What does the _sre.c code do?
>Depends on your needs: CSV file
Andrew McNamara wrote:
Andrew McNamara wrote:
There's a bunch of jobs we (CSV module maintainers) have been putting
off - attached is a list (in no particular order):
* unicode support (this will probably uglify the code considerably).
Martin v. Löwis wrote:
Can you please elaborate on that? What
>> Andrew McNamara wrote:
>>> There's a bunch of jobs we (CSV module maintainers) have been putting
>>> off - attached is a list (in no particular order):
>>> * unicode support (this will probably uglify the code considerably).
>>
>Martin v. Löwis wrote:
>> Can you please elaborate on that? What n
Martin v. Löwis wrote:
Andrew McNamara wrote:
There's a bunch of jobs we (CSV module maintainers) have been putting
off - attached is a list (in no particular order):
* unicode support (this will probably uglify the code considerably).
Can you please elaborate on that? What needs to be done, and h
Andrew McNamara wrote:
There's a bunch of jobs we (CSV module maintainers) have been putting
off - attached is a list (in no particular order):
* unicode support (this will probably uglify the code considerably).
Can you please elaborate on that? What needs to be done, and how is
that going to be
There's a bunch of jobs we (CSV module maintainers) have been putting
off - attached is a list (in no particular order):
* unicode support (this will probably uglify the code considerably).
* 8 bit transparency (specifically, allow \0 characters in source string
and as delimiters, etc).
* Rea
13 matches
Mail list logo