Re: [Tutor] unicode: alpha, whitespaces and digits

2013-12-29 Thread eryksun
On Sun, Dec 29, 2013 at 5:58 PM, Steven D'Aprano wrote:
> If you want to test for something that a human reader will recognise as
> a "whole number", s.isdigit() is probably the best one to use.

isdigit() includes decimal digits plus other characters that have a digit value:

>>> import unicodedata
>>> print u'\N{superscript two}'
²
>>> unicodedata.digit(u'\N{superscript two}')
2

However, int(u'\N{superscript two}') raises a ValueError. int() only
accepts the subset of digits that are isdecimal().
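
A minimal sketch of that difference, written for Python 3 (in Python 2 the literals would need a u'' prefix). The helper name to_int_or_none is made up for illustration and is not from the thread:

```python
def to_int_or_none(s):
    """Return int(s) if s consists only of decimal digits, else None."""
    # isdecimal() is the check that lines up with what int() accepts;
    # isdigit() would also let through characters such as SUPERSCRIPT TWO.
    return int(s) if s.isdecimal() else None

two = '\N{SUPERSCRIPT TWO}'
print(two.isdigit(), two.isdecimal())         # True False
print(to_int_or_none(two))                    # None
print(to_int_or_none('123'))                  # 123
print(to_int_or_none('\u0661\u0662\u0663'))   # 123 (ARABIC-INDIC DIGITS ONE..THREE)
```

Guarding with isdecimal() rather than isdigit() avoids the ValueError described above.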
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] unicode: alpha, whitespaces and digits

2013-12-29 Thread Steven D'Aprano
On Mon, Dec 30, 2013 at 09:58:10AM +1100, Steven D'Aprano wrote:

> What gives you that impression? isspace works on Unicode strings too.
> 
> py> '   x'.isspace()
> False
> py> '   '.isspace()
> True

Oops, the above was copied and pasted from Python 3, which is why there 
are no u' prefixes. But it still holds in Python 2:

py> u'   '.isspace()
True
py> u'   x'.isspace()
False


Sorry for any confusion.


-- 
Steven


Re: [Tutor] unicode: alpha, whitespaces and digits

2013-12-29 Thread Steven D'Aprano
On Sun, Dec 29, 2013 at 02:36:32PM +0100, Ulrich Goebel wrote:
> Hello,
> 
> I have a unicode string s, for example u"abc", u"äöü", u"123" or 
> something else, and I have to find out whether
> 
> 1. s is not empty and contains only digits (as in u"123", but not in 
> u"3.1415")
>
> or
> 
> 2. s is empty or contains only whitespaces
>
> For all other cases I would assume a "normal" unicode string in s, 
> whatever that may be.
> 
> For the first case it could be s.isdigit(), s.isnumeric() or 
> s.isdecimal() - but which one is the best?

Depends what you are trying to do. Only you can decide which is best. 
The three methods do slightly different things:

- isdigit() tests for the digit characters 0...9, or their
  equivalent in whatever native language your computer is 
  using. 

- isdecimal() tests for decimal characters. That includes the
  so-called "Arabic numerals" 0...9 (although the Arabs don't
  use them!) as well as other decimal digits like ٠١٢...

  (The above three are ARABIC-INDIC DIGIT ZERO through TWO.)

- isnumeric() tests for characters which have the Unicode 
  numeric value property. That includes decimal digits, as well 
  as non-digit numbers such as ½ and ¾.
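
As a rough illustration of how the three methods differ, here is a Python 3 sketch that runs all of them on a few sample characters. The helper name number_tests and the choice of samples are assumptions for illustration only:

```python
import unicodedata

def number_tests(ch):
    """Return (isdecimal, isdigit, isnumeric) for a one-character string."""
    return (ch.isdecimal(), ch.isdigit(), ch.isnumeric())

# DIGIT FIVE, ARABIC-INDIC DIGIT THREE, SUPERSCRIPT TWO,
# VULGAR FRACTION ONE HALF, and a plain letter for contrast.
for ch in ['5', '\u0663', '\u00b2', '\u00bd', 'x']:
    print(f'{unicodedata.name(ch):28} {number_tests(ch)}')
```

On these samples every decimal character also passes isdigit() and isnumeric(), SUPERSCRIPT TWO passes only the last two, and the fraction passes only isnumeric().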


If you want to test for something that a human reader will recognise as 
a "whole number", s.isdigit() is probably the best one to use.


> For the second case it should be s.isspace(), but it works only on 
> strings, not on unicode strings?

What gives you that impression? isspace works on Unicode strings too.

py> '   x'.isspace()
False
py> '   '.isspace()
True
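
A small Python 3 sketch showing that isspace() also recognises non-ASCII whitespace, and that it returns False for an empty string. The sample strings and the names samples and results are chosen for illustration:

```python
samples = {
    'ASCII spaces': '   ',
    'tab and newline': '\t\n',
    'NO-BREAK SPACE': '\u00a0',
    'EM SPACE': '\u2003',
    'spaces then a letter': '   x',
    'empty string': '',
}

# isspace() is True only when there is at least one character
# and every character is whitespace.
results = {name: text.isspace() for name, text in samples.items()}
for name, flag in results.items():
    print(f'{name:22} {flag}')
```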

For the second case, you also need to check for empty strings, so you 
should use:

not s or s.isspace()

which will return True if s is empty or all whitespace, otherwise False.
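
Putting the two cases from the question together, a minimal Python 3 sketch might look like this. The function name classify and its return labels are hypothetical. It uses isdigit() as suggested above, and isdecimal() can be swapped in if the result will later be passed to int():

```python
def classify(s):
    """Sort a string into 'blank', 'digits' or 'text'."""
    # Case 2: empty, or nothing but whitespace.
    if not s or s.isspace():
        return 'blank'
    # Case 1: non-empty and digits only (so '3.1415' does not qualify).
    if s.isdigit():
        return 'digits'
    # Everything else is treated as a "normal" string.
    return 'text'

for s in ['123', '3.1415', '', '  \t', 'abc', 'äöü']:
    print(repr(s), classify(s))
```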


-- 
Steven


Re: [Tutor] unicode: alpha, whitespaces and digits

2013-12-29 Thread Dave Angel
On Sun, 29 Dec 2013 19:20:04 +, Mark Lawrence wrote:

> 2. s is empty or contains only whitespaces


Call strip() on it. If it's now empty, it was whitespace.
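
A short Python 3 sketch of the strip() approach, checked against the "not s or s.isspace()" test on a few samples. The names is_blank and is_blank_alt are made up for illustration:

```python
def is_blank(s):
    """True if s is empty or contains only whitespace."""
    # strip() with no arguments removes leading and trailing whitespace,
    # so an all-whitespace (or empty) string collapses to ''.
    return not s.strip()

def is_blank_alt(s):
    """The same test written with isspace()."""
    return not s or s.isspace()

for s in ['', '   ', '\t\n', ' x ', '123']:
    print(repr(s), is_blank(s), is_blank_alt(s))
```

The strip() version handles the empty string without a separate check.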

--
DaveA



Re: [Tutor] unicode: alpha, whitespaces and digits

2013-12-29 Thread Mark Lawrence

On 29/12/2013 13:36, Ulrich Goebel wrote:

> Hello,
>
> I have a unicode string s, for example u"abc", u"äöü", u"123" or
> something else, and I have to find out whether
>
> 1. s is not empty and contains only digits (as in u"123", but not in
> u"3.1415")
>
> or
>
> 2. s is empty or contains only whitespaces
>
> For all other cases I would assume a "normal" unicode string in s,
> whatever that may be.
>
> For the first case it could be s.isdigit(), s.isnumeric() or
> s.isdecimal() - but which one is the best?
>
> For the second case it should be s.isspace(), but it works only on
> strings, not on unicode strings?
>
> Many thanks for any help!
>
> Ulrich



This depends on whether you are using Python 2 or 3.  In the latter all 
strings are Unicode.  Please see 
http://docs.python.org/X/library/stdtypes.html#string-methods where X is 
2 or 3.  You might also want to look at 
http://docs.python.org/3.3/howto/unicode.html


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence
