Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-02-21 Thread Eric Smith
André Malo wrote:
> * Eric Smith wrote:
>> But now that I look at time.strftime in py3k, it's converting the entire
>> unicode string to a char string with PyUnicode_AsString, then converting
>> back with PyUnicode_Decode.
> 
> Looks wrong to me, too... :-)
> 
> nd

I don't understand Unicode encoding/decoding well enough to describe 
this bug, but I admit it looks suspicious.  Could someone who does 
understand it open a bug against 3.0 (hopefully with an example that fails)?

The bug should also mention that 2.6 avoids this problem entirely by not 
supporting unicode with strftime or datetime.__format__, but 2.6 could 
probably leverage whatever solution is developed for 3.0.

Thanks.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-02-17 Thread André Malo
* Eric Smith wrote:

> André Malo wrote:
> > I guess, a clean and complete solution (besides re-implementing the
> > whole thing) would be to resolve each single format character with
> > strftime, decode according to the locale and re-assemble the result
> > string piece by piece. Doh!
>
> That's along the lines of what I was thinking.  strftime already does
> some of this to support %[zZ].
>
> But now that I look at time.strftime in py3k, it's converting the entire
> unicode string to a char string with PyUnicode_AsString, then converting
> back with PyUnicode_Decode.

Looks wrong to me, too... :-)

nd
-- 
$_=q?tvc!uif)%*|#Bopuifs!A`#~tvc!Xibu)%*|qsjou#Kvtu!A`#~tvc!KBQI!)*|~
tvc!ifmm)%*|#Qfsm!A`#~tvc!jt)%*|(Ibdlfs(~  # What the hell is JAPH? ;
@_=split/\s\s+#/;$_=(join''=>map{chr(ord(  # André Malo ;
$_)-1)}split//=>$_[0]).$_[1];s s.*s$_see;  #  http://www.perlig.de/ ;


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-02-16 Thread Eric Smith
André Malo wrote:

> I guess, a clean and complete solution (besides re-implementing the whole 
> thing) would be to resolve each single format character with strftime, 
> decode according to the locale and re-assemble the result string piece by 
> piece. Doh!

That's along the lines of what I was thinking.  strftime already does 
some of this to support %[zZ].

But now that I look at time.strftime in py3k, it's converting the entire 
unicode string to a char string with PyUnicode_AsString, then converting 
back with PyUnicode_Decode.



Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-02-16 Thread André Malo
* Nick Coghlan wrote:

> Eric Smith wrote:
> > The bad error message is a result of __format__ passing on unicode to
> > strftime.
> >
> > There are, of course, various ugly ways to work around this involving
> > nested format calls.
>
> I don't know if this fits your definition of "ugly workaround", but what
> if datetime.__format__ did something like:
>
>    def __format__(self, spec):
>        encoding = None
>        if isinstance(spec, unicode):
>            encoding = 'utf-8'
>            spec = spec.encode(encoding)
>        result = strftime(spec, self)
>        if encoding is not None:
>            result = result.decode(encoding)
>        return result

Note that hardcoding utf-8 is a bad guess here: strftime(3) emits strings 
in the locale's encoding, so decoding them as UTF-8 will easily fail.
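
To make the failure mode concrete, here is a small sketch in Python 3 syntax. The byte string is an assumed stand-in for what strftime(3) might return for a month name under a Latin-1 French locale; it was not captured from a real C library call, and decode_or_none is a made-up helper.

```python
# Assumed stand-in for strftime(3) output under a Latin-1 locale:
# the French month name "aout" with a circumflex u, as ISO-8859-1 bytes.
locale_bytes = 'ao\u00fbt'.encode('latin-1')   # b'ao\xfbt'

def decode_or_none(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Guessing UTF-8 fails: the byte 0xFB can never appear in valid UTF-8.
utf8_guess = decode_or_none(locale_bytes, 'utf-8')

# Decoding with the encoding the bytes were actually written in works.
latin1_result = decode_or_none(locale_bytes, 'latin-1')
```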

I guess, a clean and complete solution (besides re-implementing the whole 
thing) would be to resolve each single format character with strftime, 
decode according to the locale and re-assemble the result string piece by 
piece. Doh!
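
A minimal sketch of that piece-by-piece idea, written against Python 3's time.strftime. That function already returns decoded text, so the per-piece locale decoding is assumed to happen inside it. The helper name piecewise_strftime and the one-character directive pattern are simplifications for illustration, not anything taken from the CPython source.

```python
import re
import time

# One strftime directive: '%' followed by a single character.
# (Assumed simplification: platform modifiers such as '%-d' are not handled.)
_DIRECTIVE = re.compile(r'%(.)')

def piecewise_strftime(fmt, t):
    """Format each directive on its own and copy literal text untouched."""
    pieces = []
    pos = 0
    for match in _DIRECTIVE.finditer(fmt):
        # Literal text between directives never passes through strftime.
        pieces.append(fmt[pos:match.start()])
        if match.group(1) == '%':
            pieces.append('%')            # '%%' is an escaped percent sign
        else:
            pieces.append(time.strftime(match.group(0), t))
        pos = match.end()
    pieces.append(fmt[pos:])              # trailing literal text
    return ''.join(pieces)

# 2008-02-16 12:00:00, a Saturday (tm_wday=5), day 47 of the year.
stamp = time.struct_time((2008, 2, 16, 12, 0, 0, 5, 47, 0))
text = piecewise_strftime('Jahr \u2192 %Y-%m-%d, 100%%', stamp)
```

The point of the split is that non-ASCII literal text in the format string is never encoded or decoded at all; only the output of each individual directive is subject to the locale's encoding.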

nd
-- 
> [...] does anyone happen to know what the DIV tag stands for, written out?
DIVerse stuff. Named after all the unstructured junk that people cram
into it and then position absolutely ...
   -- Florian Hartig and Lars Kasper in dciwam


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-02-16 Thread Nick Coghlan
Eric Smith wrote:
> The bad error message is a result of __format__ passing on unicode to 
> strftime.
> 
> There are, of course, various ugly ways to work around this involving 
> nested format calls.

I don't know if this fits your definition of "ugly workaround", but what 
if datetime.__format__ did something like:

    def __format__(self, spec):
        encoding = None
        if isinstance(spec, unicode):
            encoding = 'utf-8'
            spec = spec.encode(encoding)
        result = strftime(spec, self)
        if encoding is not None:
            result = result.decode(encoding)
        return result

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-02-16 Thread Eric Smith
Eric Smith wrote:
>> Guido van Rossum wrote:
>>> For data types whose output uses only ASCII, would it be acceptable if
>>> they always returned an 8-bit string and left it up to the caller to
>>> convert it to Unicode? This would apply to all numeric types. (The
>>> date/time types have a strftime() style API which means the user must
>>> be able to specify Unicode.)

I'm finally getting around to finishing this up.  The approach I've 
taken for int, long, and float, is that they take either unicode or str 
format specifiers, and always return str results.  The builtin format() 
deals with converting str to unicode, if the format specifier was 
originally unicode.  This all works great.  It allows me to easily 
implement both ''.format and u''.format taking int, long, and float 
parameters.

I'm now working on datetime.  The __format__ method is really just a 
wrapper around strftime.  I was assuming (or rather hoping) that 
strftime does the right thing with unicode and str (unicode in = unicode 
out, str in = str out).  But it turns out strftime doesn't accept unicode:

$ ./python
Python 2.6a0 (trunk:60845M, Feb 15 2008, 21:09:57)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import datetime
 >>> datetime.date.today().strftime('%y')
'08'
 >>> datetime.date.today().strftime(u'%y')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: strftime() argument 1 must be str, not unicode

As part of this task, I'm really not up to the job of changing strftime 
to support both str and unicode inputs.  So I think I'll put all of the 
__format__ code in place to support it if and when strftime supports 
unicode.  In the meantime, it won't be possible for u''.format to work 
with datetime objects.

 >>> 'year: {0:%y}'.format(datetime.date.today())
'year: 08'
 >>> u'year: {0:%y}'.format(datetime.date.today())
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: strftime() argument 1 must be str, not unicode

The bad error message is a result of __format__ passing on unicode to 
strftime.

There are, of course, various ugly ways to work around this involving 
nested format calls.
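
One such nested-call workaround, sketched in Python 3 syntax so that it runs today: format the date with a separate inner call, then splice the finished text into the outer template. In the 2.6 situation described above, the inner template would be a str and the outer one unicode. The helper name year_label is made up for illustration.

```python
import datetime

def year_label(day):
    # Inner call: only this template carries the strftime-style spec.
    inner = '{0:%y}'.format(day)
    # Outer call: receives already-formatted text, so no spec reaches strftime.
    return 'year: {0}'.format(inner)

label = year_label(datetime.date(2008, 2, 15))
```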

Maybe I'll extend strftime to unicode for the PyCon sprint.

Eric.



Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-11 Thread Eric Smith
Steve Holden wrote:
> Nick Coghlan wrote:
>> To elaborate on this a bit (and handwaving a lot of important details 
>> out of the way) do you mean something like the following for the builtin 
>> format?:
>>
>> def format(obj, fmt_spec=None):
>>     if fmt_spec is None: fmt_spec=''
>>     result = obj.__format__(fmt_spec)
>>     if isinstance(fmt_spec, unicode):
>>         if isinstance(result, str):
>>             result = unicode(result)
>>     return result
>>
> Isn't unicode idempotent? Couldn't
> 
>   if isinstance(result, str):
>       result = unicode(result)
> 
> 
> avoid repeating in Python a test already made in C by re-spelling it as
> 
>  result = unicode(result)
> 
> or have you hand-waved away important details that mean the test really 
> is required?

This code is written in C.  It already has a check to verify that the 
return from __format__ is either str or unicode, and another check that 
fmt_spec is str or unicode.  So doing the conversion only if result is 
str and fmt_spec is unicode would be a cheap decision.
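
The decision logic can be modelled in Python 3, purely as a sketch of the checks described here and not the actual C code. In this model bytes stands in for the 2.x str type and str stands in for 2.x unicode. The names format_2x, format_hook and Number are hypothetical.

```python
def format_2x(obj, fmt_spec=b''):
    """Model of the 2.6 builtin: bytes plays 2.x str, str plays 2.x unicode."""
    # Check 1: the specifier must be one of the two string types.
    if not isinstance(fmt_spec, (bytes, str)):
        raise TypeError('format spec must be bytes or str')
    result = obj.format_hook(fmt_spec)
    # Check 2: the hook must return one of the two string types.
    if not isinstance(result, (bytes, str)):
        raise TypeError('format hook must return bytes or str')
    # Convert only when an 8-bit result meets a unicode specifier.
    if isinstance(fmt_spec, str) and isinstance(result, bytes):
        result = result.decode('ascii')
    return result

class Number:
    """Hypothetical numeric type: accepts either spec type, returns 8-bit text."""
    def __init__(self, value):
        self.value = value

    def format_hook(self, fmt_spec):
        if isinstance(fmt_spec, bytes):
            fmt_spec = fmt_spec.decode('ascii')
        return format(self.value, fmt_spec).encode('ascii')

eight_bit = format_2x(Number(3), b'd')
as_unicode = format_2x(Number(3), 'd')
```

Because both type checks already exist, the extra conversion costs only the two isinstance tests on values that have been inspected anyway.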

Good catch, though.  I wouldn't have thought of it, and there are parts 
that are written in Python, so maybe I can leverage this elsewhere.  Thanks!

Eric.


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-11 Thread Eric Smith
Nick Coghlan wrote:
> Guido van Rossum wrote:
>> For data types whose output uses only ASCII, would it be acceptable if
>> they always returned an 8-bit string and left it up to the caller to
>> convert it to Unicode? This would apply to all numeric types. (The
>> date/time types have a strftime() style API which means the user must
>> be able to specify Unicode.)
> 
> To elaborate on this a bit (and handwaving a lot of important details 
> out of the way) do you mean something like the following for the builtin 
> format?:
> 
> def format(obj, fmt_spec=None):
>     if fmt_spec is None: fmt_spec=''
>     result = obj.__format__(fmt_spec)
>     if isinstance(fmt_spec, unicode):
>         if isinstance(result, str):
>             result = unicode(result)
>     return result

That's the approach I'm taking.  The builtin format is the only caller 
of __format__ that I know of, so it's the only place this would need to 
be done.


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-11 Thread Steve Holden
Nick Coghlan wrote:
> Guido van Rossum wrote:
>> For data types whose output uses only ASCII, would it be acceptable if
>> they always returned an 8-bit string and left it up to the caller to
>> convert it to Unicode? This would apply to all numeric types. (The
>> date/time types have a strftime() style API which means the user must
>> be able to specify Unicode.)
> 
> To elaborate on this a bit (and handwaving a lot of important details 
> out of the way) do you mean something like the following for the builtin 
> format?:
> 
> def format(obj, fmt_spec=None):
>     if fmt_spec is None: fmt_spec=''
>     result = obj.__format__(fmt_spec)
>     if isinstance(fmt_spec, unicode):
>         if isinstance(result, str):
>             result = unicode(result)
>     return result
> 
Isn't unicode idempotent? Couldn't

  if isinstance(result, str):
      result = unicode(result)


avoid repeating in Python a test already made in C by re-spelling it as

 result = unicode(result)

or have you hand-waved away important details that mean the test really 
is required?

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-11 Thread Nick Coghlan
Guido van Rossum wrote:
> For data types whose output uses only ASCII, would it be acceptable if
> they always returned an 8-bit string and left it up to the caller to
> convert it to Unicode? This would apply to all numeric types. (The
> date/time types have a strftime() style API which means the user must
> be able to specify Unicode.)

To elaborate on this a bit (and handwaving a lot of important details 
out of the way) do you mean something like the following for the builtin 
format?:

def format(obj, fmt_spec=None):
    if fmt_spec is None: fmt_spec=''
    result = obj.__format__(fmt_spec)
    if isinstance(fmt_spec, unicode):
        if isinstance(result, str):
            result = unicode(result)
    return result

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread Eric Smith
Guido van Rossum wrote:
> On Jan 10, 2008 9:57 AM, Eric Smith <[EMAIL PROTECTED]> wrote:
>> Eric Smith wrote:
>>> 1: How should the builtin format() work?  It takes 2 parameters, an
>>> object o and a string s, and returns o.__format__(s).  If s is None, it
>>> returns o.__format__(empty_string).  In 3.0, the empty string is of
>>> course unicode.  For 2.6, should I use u'' or ''?
>> I just re-read PEP 3101, and it doesn't mention this behavior with None.
>>   The way the code actually works is that the specifier is optional, and
>> if it isn't present then it defaults to an empty string.  This behavior
>> isn't mentioned in the PEP, either.
>>
>> This feature came from a request from Talin[0].  We should either add
>> this to the PEP (and docs), or remove it.  If we document it, it should
>> mention the 2.x behavior (as other places in the PEP do).  If we removed
>> it, it would remove the one place in the backport that's not just hard,
>> but ambiguous.  I'd just as soon see the feature go away, myself.
> 
> IIUC, the 's' argument is the format specifier. Format specifiers are
> written in a very conservative character set, so I'm not sure it
> matters. Or are you assuming that the *type* of 's' also determines
> the type of the output?

Yes, 's' is the format specifier.  I should have used its actual name. 
I am saying that the type of 's' determines the type of the output. 
Maybe that's a needless assumption for the builtin format(), since it 
doesn't inspect the value of 's' (other than to verify its type).  But 
for ''.format() and u''.format(), I was thinking it will be true (but 
see below).

It just seems weird to me that the result of format(3, u'd') would be a 
'3', not u'3'.

> I may be in the minority here, but I think I like having a default for
> 's' (as implemented -- the PEP ought to be updated) and I also think
> it should default to an 8-bit string, assuming you support 8-bit
> strings at all -- after all in 2.x 8-bit strings are the default
> string type (as reflected by their name, 'str').

As long as it's defined, I'm okay with it.  I think making the 2.6 
default be an empty str is reasonable.

>>> 3: Every overridden __format__() method is going to have to check for
>>> string or unicode, just like object.__format__() does, and return either a
>>> string or unicode object, appropriately.  I don't see any way around
>>> this, but I'd like to hear any thoughts.  I guess there aren't all that
>>> many __format__ methods that will be implemented, so this might not be a
>>> big burden.  I'll of course implement the built in ones.
>> The PEP actually mentions that this is how 2.x will have to work.  So
>> I'll go ahead and implement it that way, on the assumption that getting
>> string support into 2.6 is desirable.
> 
> I think it is. (But then I still live in a predominantly ASCII world.  :-)

I live in that same world, which is why I started implementing this to 
begin with!  I've always been more interested in the ascii version for 
2.6 than for the 3.0 unicode version.  Doing it first in 3.0 was my way 
of getting it into 2.6.

> For data types whose output uses only ASCII, would it be acceptable if
> they always returned an 8-bit string and left it up to the caller to
> convert it to Unicode? This would apply to all numeric types. (The
> date/time types have a strftime() style API which means the user must
> be able to specify Unicode.)

I guess in str.format() I could require the result of format(obj, 
format_spec) to be a str, and in unicode.format() I could convert it to 
be unicode, which would either succeed or fail.  I think all I need to 
do is have the numeric formatters work with both unicode and str format 
specifiers, and always return str results.  That should be doable. As 
you say, the format specifiers for the numerics are restricted to 8-bit 
strings, anyway.

Now that I think about it, the str .__format__() will also need to 
accept unicode and produce a str, for this to work:

u"{0}{1}{2}".format('a', u'b', 3)

I'll give these ideas a shot and see how far I get.  Thanks for the 
feedback!

Eric.



Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread Guido van Rossum
On Jan 10, 2008 9:57 AM, Eric Smith <[EMAIL PROTECTED]> wrote:
> Eric Smith wrote:
> > (I'm posting to python-dev, because this isn't strictly 3.0 related.
> > Hopefully most people read it in addition to python-3000).
> >
> > I'm working on backporting the changes I made for PEP 3101 (Advanced
> > String Formatting) to the trunk, in order to meet the pre-PyCon release
> > date for 2.6a1.
> >
> > I have a few questions about how I should handle str/unicode.  3.0 was
> > pretty easy, because everything was unicode.
> >
> > 1: How should the builtin format() work?  It takes 2 parameters, an
> > object o and a string s, and returns o.__format__(s).  If s is None, it
> > returns o.__format__(empty_string).  In 3.0, the empty string is of
> > course unicode.  For 2.6, should I use u'' or ''?
>
> I just re-read PEP 3101, and it doesn't mention this behavior with None.
>   The way the code actually works is that the specifier is optional, and
> if it isn't present then it defaults to an empty string.  This behavior
> isn't mentioned in the PEP, either.
>
> This feature came from a request from Talin[0].  We should either add
> this to the PEP (and docs), or remove it.  If we document it, it should
> mention the 2.x behavior (as other places in the PEP do).  If we removed
> it, it would remove the one place in the backport that's not just hard,
> but ambiguous.  I'd just as soon see the feature go away, myself.

IIUC, the 's' argument is the format specifier. Format specifiers are
written in a very conservative character set, so I'm not sure it
matters. Or are you assuming that the *type* of 's' also determines
the type of the output?

I may be in the minority here, but I think I like having a default for
's' (as implemented -- the PEP ought to be updated) and I also think
it should default to an 8-bit string, assuming you support 8-bit
strings at all -- after all in 2.x 8-bit strings are the default
string type (as reflected by their name, 'str').

> > 3: Every overridden __format__() method is going to have to check for
> > string or unicode, just like object.__format__() does, and return either a
> > string or unicode object, appropriately.  I don't see any way around
> > this, but I'd like to hear any thoughts.  I guess there aren't all that
> > many __format__ methods that will be implemented, so this might not be a
> > big burden.  I'll of course implement the built in ones.
>
> The PEP actually mentions that this is how 2.x will have to work.  So
> I'll go ahead and implement it that way, on the assumption that getting
> string support into 2.6 is desirable.

I think it is. (But then I still live in a predominantly ASCII world.  :-)

For data types whose output uses only ASCII, would it be acceptable if
they always returned an 8-bit string and left it up to the caller to
convert it to Unicode? This would apply to all numeric types. (The
date/time types have a strftime() style API which means the user must
be able to specify Unicode.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread Eric Smith
Eric Smith wrote:
> (I'm posting to python-dev, because this isn't strictly 3.0 related.
> Hopefully most people read it in addition to python-3000).
> 
> I'm working on backporting the changes I made for PEP 3101 (Advanced
> String Formatting) to the trunk, in order to meet the pre-PyCon release
> date for 2.6a1.
> 
> I have a few questions about how I should handle str/unicode.  3.0 was
> pretty easy, because everything was unicode.
> 
> 1: How should the builtin format() work?  It takes 2 parameters, an
> object o and a string s, and returns o.__format__(s).  If s is None, it
> returns o.__format__(empty_string).  In 3.0, the empty string is of
> course unicode.  For 2.6, should I use u'' or ''?

I just re-read PEP 3101, and it doesn't mention this behavior with None. 
  The way the code actually works is that the specifier is optional, and 
if it isn't present then it defaults to an empty string.  This behavior 
isn't mentioned in the PEP, either.

This feature came from a request from Talin[0].  We should either add 
this to the PEP (and docs), or remove it.  If we document it, it should 
mention the 2.x behavior (as other places in the PEP do).  If we removed 
it, it would remove the one place in the backport that's not just hard, 
but ambiguous.  I'd just as soon see the feature go away, myself.
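
For reference, the optional-specifier behaviour can be observed from Python in a current Python 3 interpreter. This is only a sketch of what the caller sees, not of the C implementation. The SpecRecorder class is a hypothetical helper that reports which specifier it was handed.

```python
class SpecRecorder:
    """Hypothetical helper: reports the format specifier it receives."""
    def __format__(self, format_spec):
        return 'spec=%r' % (format_spec,)

obj = SpecRecorder()
omitted = format(obj)             # no specifier passed at all
explicit = format(obj, '')        # empty specifier passed explicitly
in_template = '{0}'.format(obj)   # replacement field without a spec
```

All three calls hand the same empty string to __format__, which is why the type of that default is the one ambiguous point in the backport.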

> 3: Every overridden __format__() method is going to have to check for
> string or unicode, just like object.__format__() does, and return either a
> string or unicode object, appropriately.  I don't see any way around
> this, but I'd like to hear any thoughts.  I guess there aren't all that
> many __format__ methods that will be implemented, so this might not be a
> big burden.  I'll of course implement the built in ones.

The PEP actually mentions that this is how 2.x will have to work.  So 
I'll go ahead and implement it that way, on the assumption that getting 
string support into 2.6 is desirable.

Eric.



[0] http://mail.python.org/pipermail/python-3000/2007-August/010089.html


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread Eric Smith
M.-A. Lemburg wrote:
> On 2008-01-10 14:31, Eric Smith wrote:
>> (I'm posting to python-dev, because this isn't strictly 3.0 related.
>> Hopefully most people read it in addition to python-3000).
>>
>> I'm working on backporting the changes I made for PEP 3101 (Advanced
>> String Formatting) to the trunk, in order to meet the pre-PyCon release
>> date for 2.6a1.
>>
>> I have a few questions about how I should handle str/unicode.  3.0 was
>> pretty easy, because everything was unicode.
> 
> Since this is a new feature, why bother with strings at all
> (even in 2.6) ?
> 
> Use Unicode throughout and be done with it.

I was hoping someone would say that!  It would certainly make things 
much easier.

But for my own selfish reasons, I'd like to have str.format() work in 
2.6.  Other than the issues I raised here, I've already done the vast 
majority of the work for the code to support either string or unicode. 
For example, I put most of the implementation in Objects/stringlib, so I 
can include it either as string or unicode.

But I can live with unicode only if that's the consensus.

Eric.


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread Barry Warsaw

On Jan 10, 2008, at 9:07 AM, M.-A. Lemburg wrote:

> On 2008-01-10 14:31, Eric Smith wrote:
>> (I'm posting to python-dev, because this isn't strictly 3.0 related.
>> Hopefully most people read it in addition to python-3000).
>>
>> I'm working on backporting the changes I made for PEP 3101 (Advanced
>> String Formatting) to the trunk, in order to meet the pre-PyCon  
>> release
>> date for 2.6a1.
>>
>> I have a few questions about how I should handle str/unicode.  3.0  
>> was
>> pretty easy, because everything was unicode.
>
> Since this is a new feature, why bother with strings at all
> (even in 2.6) ?
>
> Use Unicode throughout and be done with it.

+1
-Barry



Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread M.-A. Lemburg
On 2008-01-10 14:31, Eric Smith wrote:
> (I'm posting to python-dev, because this isn't strictly 3.0 related.
> Hopefully most people read it in addition to python-3000).
> 
> I'm working on backporting the changes I made for PEP 3101 (Advanced
> String Formatting) to the trunk, in order to meet the pre-PyCon release
> date for 2.6a1.
> 
> I have a few questions about how I should handle str/unicode.  3.0 was
> pretty easy, because everything was unicode.

Since this is a new feature, why bother with strings at all
(even in 2.6) ?

Use Unicode throughout and be done with it.

> 1: How should the builtin format() work?  It takes 2 parameters, an
> object o and a string s, and returns o.__format__(s).  If s is None, it
> returns o.__format__(empty_string).  In 3.0, the empty string is of
> course unicode.  For 2.6, should I use u'' or ''?
> 
> 
> 2: In 3.0, object.__format__() is essentially this:
> 
> class object:
>     def __format__(self, format_spec):
>         return format(str(self), format_spec)
> 
> In 2.6, I assume it should be the equivalent of:
> 
> class object:
>     def __format__(self, format_spec):
>         if isinstance(format_spec, str):
>             return format(str(self), format_spec)
>         elif isinstance(format_spec, unicode):
>             return format(unicode(self), format_spec)
>         else:
>             raise TypeError('format_spec must be str or unicode')
> 
>  Does that seem right?
> 
> 
> 3: Every overridden __format__() method is going to have to check for
> string or unicode, just like object.__format__() does, and return either a
> string or unicode object, appropriately.  I don't see any way around
> this, but I'd like to hear any thoughts.  I guess there aren't all that
> many __format__ methods that will be implemented, so this might not be a
> big burden.  I'll of course implement the built in ones.
> 
> Thanks in advance for any insights.

-- 
Marc-Andre Lemburg
eGenix.com



[Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread Eric Smith
(I'm posting to python-dev, because this isn't strictly 3.0 related.
Hopefully most people read it in addition to python-3000).

I'm working on backporting the changes I made for PEP 3101 (Advanced
String Formatting) to the trunk, in order to meet the pre-PyCon release
date for 2.6a1.

I have a few questions about how I should handle str/unicode.  3.0 was
pretty easy, because everything was unicode.

1: How should the builtin format() work?  It takes 2 parameters, an
object o and a string s, and returns o.__format__(s).  If s is None, it
returns o.__format__(empty_string).  In 3.0, the empty string is of
course unicode.  For 2.6, should I use u'' or ''?


2: In 3.0, object.__format__() is essentially this:

class object:
    def __format__(self, format_spec):
        return format(str(self), format_spec)

In 2.6, I assume it should be the equivalent of:

class object:
    def __format__(self, format_spec):
        if isinstance(format_spec, str):
            return format(str(self), format_spec)
        elif isinstance(format_spec, unicode):
            return format(unicode(self), format_spec)
        else:
            raise TypeError('format_spec must be str or unicode')

 Does that seem right?
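
The 3.0 delegation in item 2 can be tried out with a user-defined class in a current Python 3 interpreter. This is a sketch of the pattern only: the Point class is hypothetical, and it defines __format__ itself rather than relying on whatever object.__format__ does in a given release.

```python
class Point:
    """Hypothetical class used to show __format__ delegating to str()."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        return 'P(%d, %d)' % (self.x, self.y)

    def __format__(self, format_spec):
        # Same delegation as the 3.0 sketch: format the str() form,
        # so string specs such as width and alignment just work.
        return format(str(self), format_spec)

padded = format(Point(1, 2), '>10')
in_template = '[{0:<9}]'.format(Point(1, 2))
```

The 2.6 variant differs only in choosing str(self) or unicode(self) to match the type of format_spec before delegating.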


3: Every overridden __format__() method is going to have to check for
string or unicode, just like object.__format__() does, and return either a
string or unicode object, appropriately.  I don't see any way around
this, but I'd like to hear any thoughts.  I guess there aren't all that
many __format__ methods that will be implemented, so this might not be a
big burden.  I'll of course implement the built in ones.

Thanks in advance for any insights.

Eric.

