On 09.01.10 01:47, Glenn Linderman wrote:

> On approximately 1/8/2010 3:59 PM, came the following characters from
> the keyboard of Victor Stinner:
>> Hi,
>>
>> Thanks for all the answers! I will try to sum up all ideas here.
>
> One concern I have with this implementation encoding="BOM" is that if
> there is no BOM it assumes UTF-8. That is probably a good assumption in
> some circumstances, but not in others.
>
> * It is not required that UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE
> encoded files include a BOM. It is only required that UTF-16 and UTF-32
> (cases where the endianness is unspecified) contain a BOM. Hence, it
> might be that someone would expect a UTF-16LE (or any of the formats
> that don't require a BOM, rather than UTF-8), but be willing to accept
> any BOM-discriminated format.
>
> * Potentially, this could be expanded beyond the various Unicode
> encodings... one could envision that a program whose data files
> historically were in any particular national language locale, could want
> to be enhance to accept Unicode, and could declare that they will accept
> any BOM-discriminated format, but want to default, in the absence of a
> BOM, to the original national language locale that they historically
> accepted. That would provide a migration path for their old data files.
>
> So the point is, that it might be nice to have
> "BOM-otherEncodingForDefault" for each other encoding that Python
> supports. Not sure that is the right API, but I think it is expressive
> enough to handle the cases above. Whether the cases solve actual
> problems or not, I couldn't say, but they seem like reasonable cases.

This is doable with the current API. Simply define a codec search
function that handles all encoding names that start with "BOM-" and
pass the "otherEncodingForDefault" part along to the codec.

> It would, of course, be nicest if OS metadata had been invented way back
> when, for all OSes, such that all text files were flagged with their
> encoding... then languages could just read the encoding and do the right
> thing! But we live in the real world, instead.

Servus,
   Walter
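
For illustration, the search-function approach might look roughly like the
sketch below. It is only a sketch under assumptions: the names bom_search and
_make_decode are made up for this example, encoding simply delegates to the
fallback codec (no BOM is written), and only stateless decoding is covered.
Reading through open() would additionally need an incremental decoder, which
is left out here.

```python
import codecs

# BOMs to sniff, longest first: the UTF-32-LE BOM starts with the
# UTF-16-LE BOM, so UTF-32 has to be checked before UTF-16.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def _make_decode(fallback_info):
    """Build a stateless decode function (hypothetical helper)."""
    def decode(data, errors="strict"):
        data = bytes(data)  # bytes.decode() may hand us a memoryview
        for bom, encoding in _BOMS:
            if data.startswith(bom):
                text, consumed = codecs.lookup(encoding).decode(
                    data[len(bom):], errors)
                return text, len(bom) + consumed
        # No BOM found: use the encoding named after the "BOM-" prefix.
        return fallback_info.decode(data, errors)
    return decode


def bom_search(name):
    """Codec search function for names of the form "BOM-<fallback>"."""
    # The registry passes a lowercased name; whether hyphens arrive as
    # "-" or "_" depends on the Python version, so normalize both.
    normalized = name.lower().replace("-", "_")
    if not normalized.startswith("bom_"):
        return None
    try:
        fallback_info = codecs.lookup(normalized[len("bom_"):])
    except LookupError:
        return None
    return codecs.CodecInfo(
        name=name,
        encode=fallback_info.encode,  # assumption: encode without a BOM
        decode=_make_decode(fallback_info),
    )


codecs.register(bom_search)
```

With that registered, data.decode("BOM-latin-1") would honour a BOM if one is
present and otherwise treat the bytes as Latin-1, which is the migration
path described in the quoted message.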