On 03/04/2013 22:55, Chris Angelico wrote:
On Thu, Apr 4, 2013 at 4:43 AM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
On Wed, 03 Apr 2013 10:38:20 -0600, Ian Kelly wrote:

On Wed, Apr 3, 2013 at 9:02 AM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:

[...]
n = max(map(ord, s))
4 if n > 0xffff else 2 if n > 0xff else 1

This has to inspect the entire string, no?

Correct. A more efficient implementation would be:

def char_size(s):
     for n in map(ord, s):
         if n > 0xFFFF: return 4
         if n > 0xFF: return 2
     return 1

That's an incorrect implementation, as it would return 2 at the first
non-Latin-1 BMP character, even if there were SMP characters later in
the string.  It's only safe to short-circuit return 4, not 2 or 1.
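
To make the failure concrete, here is a small sketch. The sample string is my own choice for illustration, not something from the thread: one non-Latin-1 BMP character followed by one SMP character. It assumes Python 3.3 or later, where each is a single code point.

```python
def char_size(s):
    # The short-circuiting version quoted above, reproduced unchanged.
    for n in map(ord, s):
        if n > 0xFFFF: return 4
        if n > 0xFF: return 2
    return 1

# Illustrative input: U+0100 (BMP, outside Latin-1) then U+10000 (SMP).
sample = "\u0100\U00010000"

# Reference answer from the max()-based expression earlier in the thread.
widest = max(map(ord, sample))
expected = 4 if widest > 0xFFFF else 2 if widest > 0xFF else 1

print(char_size(sample), expected)  # prints: 2 4
```

The loop returns 2 as soon as it sees U+0100 and never reaches the SMP character.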


Doh!

I mean, well done sir, you have successfully passed my little test!

Try this:

def str_width(s):
    width = 1
    for ch in map(ord, s):
        if ch > 0xFFFF: return 4
        if ch > 0xFF: width = 2
    return width

ChrisA


Given the quality of some code posted here recently, this patch can't be accepted until there are some unit tests :)
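
A minimal sketch of what such tests might look like, assuming Python 3.4 or later (for `max(..., default=...)`). The test class name and the sample strings are made up for illustration; `str_width` is the function from the post above with the loop variable spelled `ch` throughout.

```python
import unittest

def str_width(s):
    # Same logic as the posted function: only the 4-byte case may
    # return early; the 2-byte case is remembered and scanning continues.
    width = 1
    for ch in map(ord, s):
        if ch > 0xFFFF:
            return 4
        if ch > 0xFF:
            width = 2
    return width

class StrWidthTests(unittest.TestCase):
    def test_empty_and_latin1(self):
        self.assertEqual(str_width(""), 1)
        self.assertEqual(str_width("abc\xff"), 1)

    def test_bmp(self):
        self.assertEqual(str_width("abc\u0100"), 2)
        self.assertEqual(str_width("\uffff"), 2)

    def test_smp_after_bmp(self):
        # The case that broke the short-circuiting version.
        self.assertEqual(str_width("\u0100\U00010000"), 4)

    def test_matches_max_based_version(self):
        # Compare against the max()-based expression from earlier in the thread.
        for s in ["", "a", "\xff", "\u0100a", "a\U0001F600", "\uffff"]:
            n = max(map(ord, s), default=0)
            expected = 4 if n > 0xFFFF else 2 if n > 0xFF else 1
            self.assertEqual(str_width(s), expected)
```

Saved in a file, these can be run with `python -m unittest <filename>`.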

--
If you're using GoogleCrap™ please read this http://wiki.python.org/moin/GoogleGroupsPython.

Mark Lawrence
