John Machin wrote:
On Oct 27, 7:15 am, Ethan Furman <et...@stoneleaf.us> wrote:
Let me rephrase -- say I get a dbf file with an LDID of \x0f that maps
to cp437, and the file came from a German OEM machine... could that
file have upper-ascii codes that will not map to anything reasonable on
my \x01 cp437 machine?  If so, is there anything I can do about it?

ASCII is defined over the first 128 codepoints; "upper-ascii codes" is
meaningless. As for the rest of your question, if the file's encoded
in cpXXX, it's encoded in cpXXX. If either the creator or the reader
or both are lying, then all bets are off.
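
To make the point concrete, here is a small sketch using only Python's built-in codecs. The choice of cp850 as the "other" code page is just an illustration, not something taken from the file in question: bytes below 0x80 decode the same way everywhere, while bytes from 0x80 up mean whatever the code page says they mean.

```python
# Bytes 0x00-0x7F are ASCII and decode identically under both code pages.
low = bytes(range(0x80))
assert low.decode("cp437") == low.decode("cp850") == low.decode("ascii")

# Bytes 0x80-0xFF are not ASCII at all; the strict ascii codec rejects them.
try:
    b"\x80".decode("ascii")
    ascii_rejected = False
except UnicodeDecodeError:
    ascii_rejected = True

# The same high byte is a different character in each code page.
sample = b"\x9b"
as_cp437 = sample.decode("cp437")  # cent sign, U+00A2
as_cp850 = sample.decode("cp850")  # o with stroke, U+00F8

# Collect every high byte on which the two code pages disagree.
differing = [
    b for b in range(0x80, 0x100)
    if bytes([b]).decode("cp437") != bytes([b]).decode("cp850")
]
print(len(differing), "high bytes differ between cp437 and cp850")
```

So if a file labelled cp437 was really written in cp850, some characters come out wrong and nothing in the bytes themselves will tell you.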

My confusion is this -- is there a difference between any of the various cp437s? Going down the list at ESRI: 0x01, 0x09, 0x0b, 0x0d, 0x0f, 0x11, 0x15, 0x18, 0x19, and 0x1b all map to cp437, and they have names such as US, Dutch, Finnish, French, German, Italian, Swedish, Spanish, English (Britain & US)... are these all the same?


BTW, what are you planning to do with an LDID of 0x00?

Hmmm.  Well, logical choices seem to be either treating it as plain
ASCII, and barfing when high-ASCII shows up; defaulting to \x01; or
forcing the user to choose one on initial access.

It would be more useful to allow the user to specify an encoding than
an LDID.

I plan on using the same technique used in xlrd and xlwt, and allowing an encoding to be specified when the table is opened. If not specified, it will use whatever the table has in the LDID field.
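
A rough sketch of that lookup order, assuming a hypothetical helper named `resolve_encoding` and a table holding only the LDID values listed above as cp437 (a real table would need the other code pages too):

```python
import codecs

# LDID values listed earlier in the thread as mapping to cp437.
CP437_LDIDS = (0x01, 0x09, 0x0B, 0x0D, 0x0F, 0x11, 0x15, 0x18, 0x19, 0x1B)
LDID_TO_CODEC = {ldid: "cp437" for ldid in CP437_LDIDS}


def resolve_encoding(ldid, encoding=None):
    """Return the codec name to use for a table (hypothetical helper).

    An explicit encoding always wins; otherwise fall back to the LDID byte.
    """
    if encoding is not None:
        # codecs.lookup raises LookupError for names Python does not know.
        return codecs.lookup(encoding).name
    try:
        return LDID_TO_CODEC[ldid]
    except KeyError:
        raise ValueError("no codec known for LDID 0x%02x" % ldid)


print(resolve_encoding(0x0F))                    # LDID decides
print(resolve_encoding(0x0F, encoding="cp850"))  # caller overrides
```

As far as Python is concerned, every LDID in that tuple ends up at the one and only cp437 codec; whether the bytes were really written in cp437 is a separate matter.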


You need to be able to read files created not only by software like
VFP or dBase but also scripts using third-party libraries. It would be
useful to allow an encoding to override an LDID that is incorrect e.g.
the LDID implies cp1251 but the data is actually encoded in koi8[ru].
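
A quick illustration of that failure mode, with a made-up sample string and koi8_r standing in for the koi8 variant: the bytes decode without any error under cp1251, so only an explicit override gets the right text back.

```python
# Sample text is invented for the demonstration.
text = "привет"
raw = text.encode("koi8_r")    # what the third-party script actually wrote

wrong = raw.decode("cp1251")   # what a reader trusting the LDID would get
right = raw.decode("koi8_r")   # what a reader gets with the override

print(repr(wrong))
print(repr(right))
```

Note that the wrong decode raises nothing; it just quietly produces different Cyrillic letters.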

Read this: http://en.wikipedia.org/wiki/Code_page_437
With no LDID in the file and no encoding supplied, I'd be inclined to
make it barf if any codepoint not in range(32, 128) showed up.
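
Something along these lines, as a sketch only; `decode_without_encoding` is a hypothetical name, and the accepted range is exactly the range(32, 128) mentioned above:

```python
def decode_without_encoding(raw):
    """Decode field bytes when neither an LDID nor an encoding is available.

    Hypothetical helper: accepts only bytes in range(32, 128) and raises
    ValueError on anything else instead of guessing a code page.
    """
    for offset, byte in enumerate(raw):
        if byte not in range(32, 128):
            raise ValueError(
                "byte 0x%02x at offset %d: no LDID and no encoding supplied"
                % (byte, offset)
            )
    return raw.decode("ascii")


print(decode_without_encoding(b"Hello, dbf"))
```

Control characters such as tab are rejected too, since they fall below 32.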

Sounds reasonable -- especially when the encoding can be overridden.

~Ethan~
--
http://mail.python.org/mailman/listinfo/python-list
