2008/4/16, Michael Urman <[EMAIL PROTECTED]>:
> I'll miss this, as I suspect the case of printing a list of unicode
> strings will be fairly common. Given Unicode identifiers, even print
> locals() could hit this. But perhaps tools for printing better
> summaries of the contents of lists and dic
Hello. Sorry for being a bit late in the discussion - my sysadmin has
problems setting up our DNS server so I could not send mail.
On Tue, Apr 15, 2008 at 06:07:46PM -0400, Terry Reedy wrote:
> import unirep
> print(*map(unirep.russian, objects))
>
> or even
>
> from unirep import rus_print
>
>
atsuo ishimoto wrote:
> Using repr() to build output strings is common practice in the Python world,
> so repr() is called everywhere in Python core and third-party applications
> to print objects, emit logs, etc.
>
> For example,
>
> >>> f = open("日本語")
> Traceback (most recent call last):
> F
On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote:
> atsuo ishimoto wrote:
> > IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e'
>
> This is starting to seem to me more like something to be addressed
> through sys.displayhook/excepthook at the interactive interpreter l
Oleg Broytmann wrote:
> On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote:
>> atsuo ishimoto wrote:
>>> IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e'
>> This is starting to seem to me more like something to be addressed
>> through sys.displayhook/excepthook at the i
On Wed, Apr 16, 2008 at 11:21:26PM +1000, Nick Coghlan wrote:
> Hmm, the io module along with sys.stdout/err may be a better way to
> attack the problem then. Given:
>
> import sys, io
>
> class ParseUnicodeEscapes(io.TextIOWrapper):
>     def write(self, text):
>         super().write(text.encode('
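The quoted class is cut off in this listing, so the following is only a rough sketch of the general idea (undoing \uXXXX escapes on the way to the output stream), not a reconstruction of Nick's code. The names UnescapingWriter and _U_ESCAPE are hypothetical, and a regular expression is used here as an assumption rather than whatever the original did.

```python
import io
import re

# Hypothetical helper: matches a backslash, "u", and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


class UnescapingWriter:
    """Wrap a text stream and turn \\uXXXX sequences back into characters."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, text):
        # Replace each escape with the character it names, then pass it on.
        decoded = _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
        return self._stream.write(decoded)

    def flush(self):
        self._stream.flush()


buf = io.StringIO()
out = UnescapingWriter(buf)
out.write(r"No such file or directory: '\u65e5\u672c\u8a9e'")
result = buf.getvalue()
```

A filter like this cannot tell an escape produced by repr() from text that merely contains a backslash followed by "u" and four hex digits, so it can rewrite output it should have left alone.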
2008/4/16 Oleg Broytmann <[EMAIL PROTECTED]>:
> The problem manifests itself in scripts, too:
>
> Traceback (most recent call last):
> File "./ttt.py", line 4, in <module>
> open("тест") # filename is in koi8-r encoding
> IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4'
Note tha
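As a small illustration of where the '\xd4\xc5\xd3\xd4' in that traceback comes from, the sketch below encodes the same file name with the koi8-r codec and looks at the repr of the resulting bytes. It is only a demonstration of the byte values, not a reproduction of the script in the message.

```python
name = "тест"

# Encode the name the way a koi8-r terminal or file system would store it.
koi8_bytes = name.encode("koi8-r")

# The repr of the raw bytes shows each byte as a \x escape.
escaped = repr(koi8_bytes)

# Decoding with the right codec recovers the readable name.
round_trip = koi8_bytes.decode("koi8-r")
```

The escaped form is unambiguous, but it is readable only to someone who knows the koi8-r table by heart.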
On Wed, Apr 16, 2008 at 07:26:36AM -0700, Guido van Rossum wrote:
> 2008/4/16 Oleg Broytmann <[EMAIL PROTECTED]>:
> > The problem manifests itself in scripts, too:
> >
> > Traceback (most recent call last):
> > File "./ttt.py", line 4, in <module>
> > open("тест") # filename is in koi8-r encoding
Oleg Broytmann wrote:
> On Wed, Apr 16, 2008 at 11:21:26PM +1000, Nick Coghlan wrote:
>> You get:
>>
>> >>> "тест"
>> 'тест'
>> >>> open("тест")
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/home/ncoghlan/devel/py3k/Lib/io.py", line 212, in __new__
>> return o
I just had a shower, and I think it's cleared my thoughts a bit. :-)
Clearly this is an important problem to those in countries where ASCII
doesn't cut it. And just like in Python 3000 we're using UTF-8 as the
default source encoding and allowing Unicode letters in identifiers, I
think we should b
On Tue, Apr 15, 2008 at 10:30 PM, Greg Ewing
<[EMAIL PROTECTED]> wrote:
> Terry Reedy wrote:
> > import unirep
> > print(*map(unirep.russian, objects))
>
> That's okay if the objects are strings, but what about
> non-string objects that contain strings?
>
> We'd need another protocol, such as
[Jason Orendorff]
> Or have str.__repr__() respect per-thread settings, the way decimal
> arithmetic does.
I don't think that's a very compelling example. I have serious issues
with having global or per-thread state that can change the outcome of
repr(); it would make it impossible to write corr
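For readers unfamiliar with the decimal comparison, here is a minimal sketch of how a context setting changes the result of the same expression. It uses only the standard decimal module; the function name third is made up for the example.

```python
from decimal import Decimal, localcontext


def third():
    # The same expression every time; only the active context differs.
    return Decimal(1) / Decimal(3)


with localcontext() as ctx:
    ctx.prec = 5
    low = third()

with localcontext() as ctx:
    ctx.prec = 10
    high = third()
```

Here low and high differ even though the code that computes them is identical. That is the kind of context dependence being objected to for repr().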
On Wed, Apr 16, 2008 at 11:05 AM, Jason Orendorff
<[EMAIL PROTECTED]> wrote:
> There really are two use cases here: a human-readable repr for
> error/warning/log messages; and a machine-readable, always-the-same,
> ASCII-only repr. Users want to be able to tweak the former.
Does machine-readab
2008/4/16, Nick Coghlan <[EMAIL PROTECTED]>:
> Oleg Broytmann wrote:
> > On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote:
> >> atsuo ishimoto wrote:
> >>> IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e'
> >> This is starting to seem to me more like something to b
2008/4/16, Guido van Rossum <[EMAIL PROTECTED]>:
> Note that this can be a feature too! You might have a filename that
> *looks* normal but contains a character from a different language --
> the \u encoding will show you the problem.
You won't call it a feature, if your *normal* encoding was ko
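The look-alike case Guido describes can be sketched as follows. The file names are invented for the example, and ascii() is used as a stand-in for an ASCII-only repr.

```python
# Two names that look the same on screen (hypothetical file names).
latin = "paypal.txt"
mixed = "p\u0430ypal.txt"  # second letter is CYRILLIC SMALL LETTER A (U+0430)

same = latin == mixed

# An ASCII-only representation makes the odd character visible.
latin_escaped = ascii(latin)
mixed_escaped = ascii(mixed)
```

The escape exposes the stray character in a mostly-ASCII name, but a name written entirely in Cyrillic would come out as nothing but escapes. That is the objection raised in the reply.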
I changed my mind already. :-) See my post of this morning in another thread.
On Wed, Apr 16, 2008 at 4:09 PM, atsuo ishimoto <[EMAIL PROTECTED]> wrote:
> 2008/4/16, Guido van Rossum <[EMAIL PROTECTED]>:
>
> > Note that this can be a feature too! You might have a filename that
> > *looks* normal
2008/4/17, Guido van Rossum <[EMAIL PROTECTED]>:
> I changed my mind already. :-) See my post of this morning in another thread.
Ah, I missed the mail! Thank you.
___
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinf
I've reordered Guido's words.
Guido van Rossum writes:
> For those of us with less capable IO devices, setting the error flag
> for stdout and stderr to backslashreplace is probably the best
> solution (and it solves more problems than just repr()).
True. But it doesn't solve the ambiguity p
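A minimal sketch of the backslashreplace suggestion, assuming an ASCII-only output device. An in-memory BytesIO stands in for stdout so the bytes that would reach the terminal can be inspected.

```python
import io

# Stand-in for a terminal that accepts only ASCII bytes.
raw = io.BytesIO()
stream = io.TextIOWrapper(
    raw,
    encoding="ascii",
    errors="backslashreplace",  # escape unencodable characters instead of raising
    newline="\n",               # keep line endings identical on every platform
)

stream.write("file: 日本語\n")
stream.flush()
output = raw.getvalue()
```

With the default errors="strict" the same write would raise UnicodeEncodeError. With backslashreplace the text still gets through, in escaped form.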
hello all,
A few times in practice I have been tripped up by how Python keeps
variables in scope after a loop, and it wasn't immediately obvious what the
problem was. I think it is one of the ugliest and least intuitive features,
and I hope some others agree that it should be changed in py3k.
>>>
Guido van Rossum wrote:
> The more I think about this, the more I believe that repr() should
> *not* be changed, and that instead we should give people who like to
> see '日本語' instead of '\u1234\u5678\u9abc' other tools to help
> themselves.
This seems to be a rather ASCII-centric way of thinking
Martin v. Löwis wrote:
> 3.6 byte
> addressable unit of data storage large enough to hold any
> member of the basic character set of the execution
> environment
Blarg. Well, I think the wording of that part of the
standard is braindamaged. The word "byte" already has
a pre-existing m
previous discussion at
http://mail.python.org/pipermail/python-dev/2005-September/056677.html
I don't agree with the author that
>>> i = 3
>>> for i in range(11): pass
...
>>> i
10
is much less confusing than i returning 3. Furthermore, his C example makes
it obvious that "i" will be available in
Oleg Broytmann wrote:
> Do I understand it right that str(objects) calls repr() on items to
> properly quote strings? (str([1, '1']) must give "[1, '1']" as the result).
> Is it the only reason?
In the case of strings, yes. More generally, there
can be any kind of object in the list, and repr(x)
i
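The behaviour under discussion can be checked directly. The class Tag below is a made-up example type, used only to show that a container's str() uses repr() for its items.

```python
class Tag:
    """Hypothetical type with distinct str() and repr() forms."""

    def __str__(self):
        return "tag"

    def __repr__(self):
        return "Tag()"


# Strings inside a list are quoted, because the list calls repr() on each item.
mixed_text = str([1, "1"])

# The same holds for arbitrary objects: __repr__ is used, not __str__.
tag_text = str([Tag()])
```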
Oleg Broytmann wrote:
> Traceback (most recent call last):
> File "./ttt.py", line 4, in <module>
> open("тест") # filename is in koi8-r encoding
> IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4'
In that particular case, I'd say the IOError constructor
is doing the wrong thing -- i
Nick Coghlan wrote:
> Unfortunately, it turns out that the trick also breaks display of
> strings containing any other escape codes.
There's also the worry that it could trigger falsely
on something that happened to look like \u but
didn't originate from the repr of a unicode char.
> I'm st
David Cournapeau wrote:
> They are totally different concepts: byte is not a (C) type, but a unit,
> the one returned by the sizeof operator.
If a word is needed for this concept, then invent a new
one, e.g. "size unit", rather than reusing "byte", which
everyone already understands as meaning 8
Greg Ewing wrote:
>
> Blarg. Well, I think the wording of that part of the
> standard is braindamaged. The word "byte" already has
> a pre-existing meaning outside of C, and the C standard
> shouldn't be redefining it for its own purposes.
>
> This is like a financial document that defines "dollar"
Greg Ewing wrote:
>
> If a word is needed for this concept, then invent a new
> one, e.g. "size unit", rather than reusing "byte", which
> everyone already understands as meaning 8 bits.
>
Maybe everyone understands it as 8 bits, but it has always been wrong.
Byte is a unit of storage, which o
On Apr 16, 2008, at 11:00 PM, Greg Ewing wrote:
> If a word is needed for this concept, then invent a new
> one, e.g. "size unit", rather than reusing "byte", which
> everyone already understands as meaning 8 bits.
Nope. Everyone understands "octet" to be 8 bits.
Bytes being exactly 8 bits is i
David Cournapeau wrote:
> Maybe everyone understands it as 8 bits, but it has always been wrong.
It may not be officially written down anywhere, but
almost everyone in the world understands a byte to mean
8 bits. When you go into a computer store and ask for
256MB of RAM, you don't expect to be a
On Wed, Apr 16, 2008 at 6:53 PM, Greg Ewing <[EMAIL PROTECTED]> wrote:
...
> > open("тест") # filename is in koi8-r encoding
> > IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4'
>
> In that particular case, I'd say the IOError constructor
> is doing the wrong thing -- it
Alex Martelli wrote:
> I disagree: I always recommend using %r to display (in an error
> message, log entry, etc), a string that may be in error,
For debugging messages, yes, but not output produced
in the normal course of operation. And "File Not Found"
I consider to be in the latter category --
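The difference between the two styles is easy to see with a name that has a trailing space. The file name here is invented for the example.

```python
name = "report.txt "  # note the trailing space (hypothetical file name)

# %s inserts the string as-is, so the stray space is invisible in the message.
plain = "No such file: %s" % name

# %r inserts repr(name), so the quotes show exactly where the string ends.
quoted = "No such file: %r" % name
```

This is the argument for %r in error messages. The counter-argument in the thread is that the same repr() also escapes non-ASCII names in messages meant for ordinary users.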
On Wed, Apr 16, 2008 at 10:20 PM, Greg Ewing
<[EMAIL PROTECTED]> wrote:
> Alex Martelli wrote:
> > I disagree: I always recommend using %r to display (in an error
> > message, log entry, etc), a string that may be in error,
>
> For debugging messages, yes, but not output produced
> in the norma
On Wed, Apr 16, 2008 at 10:32 PM, Greg Ewing
<[EMAIL PROTECTED]> wrote:
> David Cournapeau wrote:
>
> > Maybe everyone understands it as 8 bits, but it has always been wrong.
>
> It may not be officially written down anywhere, but
> almost everyone in the world understands a byte to mean
> 8 bi