[issue6991] logging encoding failes some situation

2009-09-25 Thread Naoki INADA

Naoki INADA  added the comment:

OK, I agree.
Thank you for your answer.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6991] logging encoding failes some situation

2009-09-25 Thread Vinay Sajip

Vinay Sajip  added the comment:

It's not about logging - your first example (foo.py) didn't have any
logging code in it.

The problem arises only when someone doesn't understand how Unicode
and codecs.open work, and logging can't fix this.

The rule is: if you use a stream without an encoding and write byte
strings to it under Python 2.x, you'll be fine as long as you're using
ASCII or Latin-1. However, users of systems outside this (e.g. CJK or
Cyrillic) will not be covered.

For use anywhere, you really have to work in Unicode internally, decode
stuff on the way in and encode stuff on the way out. That's what the
codecs module is for.
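
A minimal sketch of that "decode in, encode out" pattern follows. It is
written for Python 3 (where str is Unicode and bytes is the byte string),
not the Python 2.x discussed in this thread. The sample bytes, the assumed
Latin-1 source encoding and the temporary file path are illustrative
assumptions only.

```python
import codecs
import os
import tempfile

# Bytes arriving from outside the program (assumed here to be Latin-1).
raw = b"caf\xe9"

# Decode on the way in, using the encoding of the *source*.
text = raw.decode("latin-1")

# Encode on the way out: the codecs stream does the encoding for us,
# so we hand it text, never a byte string.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read the file back in binary mode to see the bytes actually written.
with open(path, "rb") as f:
    written = f.read()

print(repr(written))
```

Internally the program only ever handles text; the two encodings appear
only at the boundaries.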

If third-party libraries which you are using don't use Unicode properly,
then they are broken, and logging can't fix that. Any attempt to "paper
over the cracks" will fail sooner or later. It's better to identify the
problem exactly where it occurs: Python's Zen says "Errors should never
pass silently."

I'm closing this issue, as it's not really logging-related. Hope that's OK.

--
resolution:  -> invalid
status: open -> closed




[issue6991] logging encoding failes some situation

2009-09-25 Thread Naoki INADA

Naoki INADA  added the comment:

OK, you're right.
But logging is a very basic feature and is used by a very wide range of
modules. "All logging code should use unicode strings" is right, but
difficult in practice.

Also, logging is usually used for debugging. So I think logging should
write logs as safely as possible.
When log.error(...) is called, no one wants a UnicodeDecodeError
from logging in their log.

--
status: pending -> open




[issue6991] logging encoding failes some situation

2009-09-25 Thread Vinay Sajip

Vinay Sajip  added the comment:

Your second example (logging_error.py) fails for the same reason -
you're writing a byte-string to a stream which is expecting Unicode. The
error occurs in logging only because it tries encoding as UTF-8 as a
last-ditch attempt - and that only happens because of an earlier
exception caused by you not writing a Unicode string.

In summary: If you open a stream via codecs.open, whether directly or
through the logging module, you are expecting the stream to do encoding
for you. Therefore, you only write Unicode to the stream - never a
byte-string. If you have a byte-string in your application which you
have obtained from somewhere else, convert it to Unicode using whatever
encoding applies to the source. Then, send the resulting Unicode to the
encoding stream (or logger).
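
The summary above can be sketched for the logging case as follows. This is
a Python 3 illustration, not the Python 2.x code under discussion. The
logger name, the temporary log file and the assumption that the incoming
byte string is Latin-1 are all hypothetical.

```python
import logging
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "app.log")

# The handler owns the output encoding; callers only pass text to it.
handler = logging.FileHandler(log_path, encoding="utf-8")
logger = logging.getLogger("issue6991_example")  # hypothetical name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example's output in the file only

# A byte string obtained from somewhere else (assumed to be Latin-1).
raw = b"\xaa"

# Convert it to text using the encoding that applies to its source ...
message = raw.decode("latin-1")

# ... then send the resulting text to the logger.
logger.info("received: %s", message)

handler.close()
logger.removeHandler(handler)

# Inspect the bytes that ended up in the log file.
with open(log_path, "rb") as f:
    logged = f.read()

print(repr(logged))
```

The byte 0xaa never reaches the stream as a raw byte; it is decoded once
at the boundary and re-encoded as UTF-8 by the handler.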

--
status: open -> pending




[issue6991] logging encoding failes some situation

2009-09-25 Thread Naoki INADA

Naoki INADA  added the comment:

Another sample.

Traceback (most recent call last):
  File "C:\usr\Python2.6\lib\logging\__init__.py", line 790, in emit
    stream.write(fs % msg.encode("UTF-8"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 0: ordinal not in range(128)

This is because logging.FileHandler uses codecs.open internally.

--
status: pending -> open
Added file: http://bugs.python.org/file14973/logging_error.py




[issue6991] logging encoding failes some situation

2009-09-25 Thread Vinay Sajip

Vinay Sajip  added the comment:

There seems to be a problem with your foo.py. In it, you are writing a
byte-string to a stream returned from codecs.open. I don't think this is
correct: you should be writing a Unicode string to that stream, which
will convert to bytes using the stream's encoding, and write those bytes
to file. The following snippet illustrates:

>>> import codecs
>>> f = codecs.open('foo.txt', 'w', encoding='utf-8')
>>> f.write(u'\u76F4\u6A39\u7A32\u7530')
>>> f.close()
>>> f = open('foo.txt', 'r')
>>> f.read()
'\xe7\x9b\xb4\xe6\xa8\xb9\xe7\xa8\xb2\xe7\x94\xb0'

As you can see, the Unicode has been converted using UTF-8.

--
status: open -> pending




[issue6991] logging encoding failes some situation

2009-09-25 Thread Naoki INADA

Naoki INADA  added the comment:

Please see and run the attached foo.py.

In Python 2.6.2, this causes the following error:
>python foo.py
Traceback (most recent call last):
  File "foo.py", line 3, in <module>
    f.write('\xaa')
  File "C:\usr\Python2.6\lib\codecs.py", line 686, in write
    return self.writer.write(data)
  File "C:\usr\Python2.6\lib\codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 0: ordinal not in range(128)

--
status: pending -> open
Added file: http://bugs.python.org/file14971/foo.py




[issue6991] logging encoding failes some situation

2009-09-25 Thread Vinay Sajip

Vinay Sajip  added the comment:

Thanks, but I'm not sure I understand the reasoning.
stream.write(unicode_string) should not do decode() internally, though
of course it would do encode(). Can you explain a little more (with an
illustrative example) what problem you are trying to solve, and attach a
small script which shows the problem? Thanks.

--
status: open -> pending




[issue6991] logging encoding failes some situation

2009-09-25 Thread Benjamin Peterson

Changes by Benjamin Peterson :


--
assignee:  -> vinay.sajip
nosy: +vinay.sajip




[issue6991] logging encoding failes some situation

2009-09-25 Thread Naoki INADA

Changes by Naoki INADA :


--
type:  -> behavior




[issue6991] logging encoding failes some situation

2009-09-24 Thread Naoki INADA

Changes by Naoki INADA :


--
versions: +Python 3.0, Python 3.1, Python 3.2




[issue6991] logging encoding failes some situation

2009-09-24 Thread Naoki INADA

New submission from Naoki INADA :

When the stream is a codecs writer object, stream.write(string) calls
string.decode() internally, which may raise UnicodeDecodeError.

In that case, falling back to UTF-8 is not good.
I think a good fallback logic is:
* When the message is unicode: message.encode(stream.encoding or 'ascii',
  'backslashreplace')
* When the message is bytes: message.encode('string_escape')

Attached patch contains this logic, refactoring and test.
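
For illustration only, the two fallback rules above might look roughly
like this. It is a Python 3 sketch and not the attached patch. The helper
name safe_encode is hypothetical. Python 3 has no 'string_escape' codec,
so the bytes branch approximates it with the 'backslashreplace' error
handler. Unlike 'string_escape', that approximation leaves backslashes,
quotes and newlines unescaped.

```python
def safe_encode(message, encoding=None):
    """Return bytes that can always be written (hypothetical helper)."""
    if isinstance(message, str):
        # Text: encode with the stream's encoding (or ASCII), escaping
        # unencodable characters as \xNN or \uNNNN instead of raising.
        return message.encode(encoding or "ascii", "backslashreplace")
    # Bytes: keep ASCII bytes as they are and escape every non-ASCII
    # byte as \xNN. This stands in for Python 2's 'string_escape'.
    text = bytes(message).decode("ascii", "backslashreplace")
    return text.encode("ascii")


print(safe_encode("\u76f4"))           # text, no stream encoding known
print(safe_encode("\u76f4", "utf-8"))  # text, stream encoding is UTF-8
print(safe_encode(b"\xaa"))            # byte string with a non-ASCII byte
```

Neither branch can raise a UnicodeError, which is the property the
proposal asks for. The cost is that undecodable input is logged in
escaped form rather than reported as an error.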

--
components: Library (Lib)
files: logging_encode.patch
keywords: patch
messages: 93100
nosy: naoki
severity: normal
status: open
title: logging encoding failes some situation
versions: Python 2.6, Python 2.7
Added file: http://bugs.python.org/file14970/logging_encode.patch
