Bugs item #1257525, was opened at 2005-08-12 12:22
Message generated for change (Comment added) made by exa
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1257525&group_id=5470
Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: Unicode
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: liturgist (liturgist)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Encodings iso8859_1 and latin_1 are redundant

Initial Comment:
./lib/encodings contains both:

    iso8859_1.py
    latin_1.py

Only one should be present. Martin says that latin_1 is faster.
Using the 'iso' name would correlate better with the other ISO
encodings provided.

If the latin_1 code is faster, then it should be in the iso8859_1.py
file. If an automated process produces the 'iso*' encodings, then it
should either produce the faster code or stop producing iso8859_1.

Regardless, one of the files should be removed.

----------------------------------------------------------------------

Comment By: Eray Ozkural (exa)
Date: 2005-10-11 21:22

Message:
Logged In: YES
user_id=1454

I understand that there ought to be one fast implementation,
but I suppose both names should be available.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2005-08-12 14:30

Message:
Logged In: YES
user_id=38388

To answer your questions:

Yes, the encoding is the same for both latin-1 and iso8859-1.
Specifying latin-1 instead of iso8859-1 will allow the code to use
short-cuts.

You have to grep for 'latin-1'.

----------------------------------------------------------------------

Comment By: liturgist (liturgist)
Date: 2005-08-12 14:01

Message:
Logged In: YES
user_id=197677

Where could one see some of the "shortcuts" in the Unicode
integration code that make using "latin_1" faster in the runtime?
I grepped *.py and *.c, but could not readily identify any
candidates.

----------------------------------------------------------------------

Comment By: liturgist (liturgist)
Date: 2005-08-12 13:12

Message:
Logged In: YES
user_id=197677

Ok. How about if we specify iso8859_1 as "(see latin_1)" in the
documentation? The code will work the same regardless of which
encoding name the developer uses. Right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2005-08-12 12:49

Message:
Logged In: YES
user_id=38388

Good point. The iso8859_1.py codec should be removed and added as
alias to latin-1.

Martin is right: the latin-1 codec is not only faster, but the
Unicode integration code also has a lot of short-cuts for the
"latin-1" encoding, so overall performance is better if you use that
name for the encoding.
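
The claim in the thread that both names describe the same encoding can be checked from Python itself. The following is only a minimal sketch, assuming a current Python 3 interpreter rather than the Python 2.4 this report was filed against. The helper name same_mapping is hypothetical and not part of the standard library. It compares the two codec names over all 256 byte values and makes no statement about their relative speed.

```python
import codecs


def same_mapping(name_a: str, name_b: str) -> bool:
    """Return True if both codec names map all 256 byte values identically.

    Hypothetical helper for illustration; it only works for single-byte
    codecs that define every byte value (decode raises otherwise).
    """
    all_bytes = bytes(range(256))
    text_a = all_bytes.decode(name_a)
    text_b = all_bytes.decode(name_b)
    # Decoding must agree, and encoding the decoded text must give back
    # the original bytes under both names.
    return (
        text_a == text_b
        and text_a.encode(name_a) == all_bytes
        and text_a.encode(name_b) == all_bytes
    )


if __name__ == "__main__":
    print(same_mapping("latin_1", "iso8859_1"))
    # Show which codec each spelling resolves to in the running interpreter.
    for name in ("latin_1", "latin-1", "iso8859_1", "iso8859-1"):
        print(name, "->", codecs.lookup(name).name)
```

The codecs.lookup() loop prints whatever canonical name the running interpreter reports for each spelling, which is one way to see whether a name is handled as an alias in a given Python version.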