Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>
>> I've checked in a whole bunch of newly generated codecs
>> which now make use of the faster charmap decoding variant added
>> by Walter a short while ago.
>>
>> Please let me know if you find any problems.
>
> I think we should work on eliminating the decoding_map variables.
> There are some codecs which rely on them being present in other codecs
> (e.g. koi8_u.py is based on koi8_r.py); however, this could be updated
> to use, say
>
> decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, {
>     0x00a4: 0x0454, # CYRILLIC SMALL LETTER UKRAINIAN IE
>     0x00a6: 0x0456, # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
>     0x00a7: 0x0457, # CYRILLIC SMALL LETTER YI (UKRAINIAN)
>     0x00ad: 0x0491, # CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
>     0x00b4: 0x0404, # CYRILLIC CAPITAL LETTER UKRAINIAN IE
>     0x00b6: 0x0406, # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
>     0x00b7: 0x0407, # CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
>     0x00bd: 0x0490, # CYRILLIC CAPITAL LETTER UKRAINIAN GHE WITH UPTURN
> })
>
> With all these cross-references gone, the decoding_maps could also go.

Why should koi8_u.py be defined in terms of koi8_r.py anyway? Why not put
a complete decoding_table into koi8_u.py?

I'd like to suggest a small cosmetic change: gencodec.py should output
byte values with two hex digits instead of four. This makes it easier to
see what is a byte value and what is a codepoint, and it would make
grepping for stuff simpler. I.e. change:

    decoding_map.update({
        0x0080: 0x0402, # CYRILLIC CAPITAL LETTER DJE

to

    decoding_map.update({
        0x80: 0x0402, # CYRILLIC CAPITAL LETTER DJE

and

    decoding_table = (
        u'\x00' # 0x0000 -> NULL

to

    decoding_table = (
        u'\x00' # 0x00 -> U+0000 NULL

and

    encoding_map = {
        0x0000: 0x0000, # NULL

to

    encoding_map = {
        0x0000: 0x00, # NULL
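
To make the proposed output format concrete, here is a rough sketch of how the three kinds of lines could be rendered. The function names are made up for illustration and are not taken from gencodec.py. The sketch is written for Python 3 and looks up character names via unicodedata instead of reading them from a mapping file, so characters without a Unicode name (e.g. control characters) show up as "<unnamed>".

```python
import unicodedata


def _name(code_point):
    # Assumption: names come from unicodedata, not from the mapping file.
    return unicodedata.name(chr(code_point), "<unnamed>")


def format_decoding_map_entry(byte_value, code_point):
    # Byte value with two hex digits, code point with four.
    return "0x%02x: 0x%04x, # %s" % (byte_value, code_point, _name(code_point))


def format_decoding_table_entry(byte_value, code_point):
    # ascii() keeps non-ASCII characters as escapes, e.g. '\u0402'.
    return "u%s # 0x%02x -> U+%04X %s" % (
        ascii(chr(code_point)), byte_value, code_point, _name(code_point))


def format_encoding_map_entry(code_point, byte_value):
    # Code point with four hex digits, byte value with two.
    return "0x%04x: 0x%02x, # %s" % (code_point, byte_value, _name(code_point))


if __name__ == "__main__":
    print(format_decoding_map_entry(0x80, 0x0402))
    print(format_decoding_table_entry(0x80, 0x0402))
    print(format_encoding_map_entry(0x0402, 0x80))
```

With this layout, a grep for "0x80:" only hits lines keyed by the byte value, and a grep for "0x0402" only hits code points.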