Bugs item #1313051, was opened at 2005-10-04 12:37
Message generated for change (Comment added) made by tony_nelson
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1313051&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Tony Nelson (tony_nelson)
Assigned to: Nobody/Anonymous (nobody)
Summary: mac_roman codec missing "apple" codepoint

Initial Comment:
The mac_roman codec is missing a single codepoint for
the trademarked Apple logo (0xF0 <=> 0xF8FF per Apple
docs), which prevents round-tripping of mac_roman text
through Unicode.  Adding the codepoint as a private
encoding (per Apple) has no trademark implications;
only the character itself, in a font, would have such
issues.

I'm using Python 2.3, but AFAICT it is an issue in
later Python versions as well.
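
The round-trip claim can be checked mechanically. The following is a
minimal sketch, not part of the original report: it uses Python 3 syntax
(the report concerns Python 2.3), and the helper name unmapped_bytes is
made up for illustration. It lists every byte value that does not survive
bytes -> Unicode -> bytes under a given codec.

```python
def unmapped_bytes(codec_name):
    """Return the byte values that fail a bytes -> str -> bytes round trip."""
    failures = []
    for value in range(256):
        raw = bytes([value])
        try:
            if raw.decode(codec_name).encode(codec_name) != raw:
                failures.append(value)
        except UnicodeError:
            # Covers both UnicodeDecodeError and UnicodeEncodeError.
            failures.append(value)
    return failures


# latin-1 maps all 256 byte values, so nothing fails.
print(unmapped_bytes("latin-1"))
# For mac_roman the result depends on the Python version in use.
print([hex(v) for v in unmapped_bytes("mac_roman")])
```

On an interpreter whose mac_roman codec lacks the 0xF0 mapping, 0xF0
should show up in the second list.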

----------------------------------------------------------------------

>Comment By: Tony Nelson (tony_nelson)
Date: 2005-10-04 22:16

Message:
Logged In: YES 
user_id=1356214

>Tony: Python is not damaging your data - the codec will
>raise an exception in case that particular character is
>converted to Unicode.

Right, crashing the unsuspecting user's program and
destroying the data utterly.  Anyway, it doesn't damage /my/
data because I add the missing codepoint to the codec:

# Fix missing Apple logo in mac_roman.
import encodings.mac_roman
if not encodings.mac_roman.decoding_map[0xF0]:
    encodings.mac_roman.decoding_map[0xF0] = 0xF8FF
    encodings.mac_roman.encoding_map[0xF8FF] = 0xF0

It just damages data for all the other users of the codec.

>Please recreate the codec using gencodec.py (which you can
>find in the Tools/ directory) and add it as an attachment to
>this bug report. Thanks.

Umm, I take it you want me to download a mapping file first.
Here is a new mac_roman.py.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2005-10-04 17:48

Message:
Logged In: YES 
user_id=38388

Tony, comments like yours are not very helpful.

Python's codecs rely on facts defined by standards bodies,
e.g. the Unicode Consortium, ISO, etc. If you don't present
proof of your claim then there's nothing much we can do
about your particular problem.

Fortunately, proof isn't hard to find in this case:

http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ROMAN.TXT

Looks like Apple added the mapping sometime after the codec
was created.

Walter: it is common for companies to add their logos as
private Unicode characters. This happens a lot in the Asian
world. Of course, interop isn't great, but at least you
don't lose information by converting to Unicode.

Tony: Python is not damaging your data - the codec will
raise an exception in case that particular character is
converted to Unicode.
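
To illustrate the behaviour described above, here is a minimal sketch in
Python 3 syntax. It does not touch the real mac_roman codec: the two
tables are hypothetical stand-ins (identity mapping, with 0xF0 either
undefined or mapped to U+F8FF), and decode_with is an illustrative helper
name. In a string decoding table, U+FFFE marks an undefined entry.

```python
import codecs

# Hypothetical 256-entry tables, NOT the real mac_roman data.
# U+FFFE in a decoding table means "undefined".
table_missing = "".join("\ufffe" if i == 0xF0 else chr(i) for i in range(256))
table_fixed = table_missing[:0xF0] + "\uf8ff" + table_missing[0xF1:]


def decode_with(table, data, errors="strict"):
    """Decode bytes through a charmap table using the given error handler."""
    text, _consumed = codecs.charmap_decode(data, errors, table)
    return text


try:
    decode_with(table_missing, b"A\xf0")
except UnicodeDecodeError as exc:
    # Strict mode raises instead of silently altering the data.
    print("strict:", exc.reason)

# "replace" avoids the exception but loses the original byte.
print(ascii(decode_with(table_missing, b"A\xf0", "replace")))
# With the mapping present, the byte decodes to the private-use code point.
print(ascii(decode_with(table_fixed, b"A\xf0")))
```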

Please recreate the codec using gencodec.py (which you can
find in the Tools/ directory) and add it as an attachment to
this bug report. Thanks.
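
For readers unfamiliar with such mapping files, the sketch below shows
one way a byte-to-Unicode table could be read from them. It is not
gencodec.py's actual code. The line format (hex byte, hex code point,
"#" comment) is an assumption, the SAMPLE lines are illustrative rather
than copied from ROMAN.TXT, and parse_mapping is a made-up name.

```python
def parse_mapping(text):
    """Build {byte value: Unicode code point} from mapping-file style text."""
    decoding = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte with no mapping listed
        decoding[int(fields[0], 16)] = int(fields[1], 16)
    return decoding


# Illustrative sample in the assumed format.
SAMPLE = """\
# byte	unicode	# name
0xEF	0x00D4	# LATIN CAPITAL LETTER O WITH CIRCUMFLEX
0xF0	0xF8FF	# Apple logo
0xF1	0x00D2	# LATIN CAPITAL LETTER O WITH GRAVE
"""

decoding = parse_mapping(SAMPLE)
# The encoding direction is the inverse of the decoding direction.
encoding = {code_point: byte for byte, code_point in decoding.items()}
print(hex(decoding[0xF0]), hex(encoding[0xF8FF]))
```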


----------------------------------------------------------------------

Comment By: Tony Nelson (tony_nelson)
Date: 2005-10-04 16:41

Message:
Logged In: YES 
user_id=1356214

It isn't Python's job to tell people what characters they
are allowed to use.  Apple defined the codepoint and its
mapping to Unicode.  Python is not the Unicode Police, and
should not damage the data it was given just to prove a
point.  Damaging the user's data isn't very "batteries
included".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2005-10-04 14:07

Message:
Logged In: YES 
user_id=89016

The codepoint 0xF8FF is in the Private Use Area, so this is
not an official Unicode character, and for other uses 0xF8FF
might mean something completely different. So I think this
mapping shouldn't be added to mac_roman.
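
Whether a code point is private-use can be checked from the Unicode
character database. A minimal sketch in Python 3 syntax follows; the
helper name is_private_use is made up for illustration. Private-use code
points carry the general category "Co".

```python
import unicodedata


def is_private_use(code_point):
    """True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"


print(is_private_use(0xF8FF))  # the code point Apple uses for its logo
print(is_private_use(0x0041))  # LATIN CAPITAL LETTER A
```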

----------------------------------------------------------------------
