Re: [Tutor] unicode: alpha, whitespaces and digits
On Sun, Dec 29, 2013 at 5:58 PM, Steven D'Aprano wrote:
> If you want to test for something that a human reader will recognise as
> a "whole number", s.isdigit() is probably the best one to use.

isdigit() includes decimal digits plus other characters that have a digit value:

    >>> import unicodedata
    >>> print u'\N{superscript two}'
    ²
    >>> unicodedata.digit(u'\N{superscript two}')
    2

However, int(u'\N{superscript two}') raises a ValueError. int() only accepts the subset of digits for which isdecimal() is true.
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
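A minimal sketch of the distinction made in the message above, written for Python 3 (where every str is Unicode, so no u'' prefixes are needed). The helper name safe_int is hypothetical and not from the thread; it only illustrates guarding int() with isdecimal() rather than isdigit().

```python
import unicodedata

SUPERSCRIPT_TWO = "\N{SUPERSCRIPT TWO}"
ARABIC_INDIC_TWO = "\N{ARABIC-INDIC DIGIT TWO}"


def safe_int(s):
    """Hypothetical helper: return int(s) if s is all decimal digits, else None."""
    # isdecimal() is False for the empty string, so "" also yields None.
    return int(s) if s.isdecimal() else None


if __name__ == "__main__":
    for ch in (SUPERSCRIPT_TWO, ARABIC_INDIC_TWO, "2"):
        # Show how the two predicates differ for each character.
        print(unicodedata.name(ch), ch.isdigit(), ch.isdecimal(), safe_int(ch))
```

The superscript two passes isdigit() but not isdecimal(), so safe_int returns None for it instead of letting int() raise ValueError.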
Re: [Tutor] unicode: alpha, whitespaces and digits
On Mon, Dec 30, 2013 at 09:58:10AM +1100, Steven D'Aprano wrote:
> What gives you that impression? isspace works on Unicode strings too.
>
> py> '   x'.isspace()
> False
> py> '   '.isspace()
> True

Oops, the above was copied and pasted from Python 3, which is why there are no u' prefixes. But it still holds in Python 2:

    py> u'   '.isspace()
    True
    py> u'   x'.isspace()
    False

Sorry for any confusion.

--
Steven
Re: [Tutor] unicode: alpha, whitespaces and digits
On Sun, Dec 29, 2013 at 02:36:32PM +0100, Ulrich Goebel wrote:
> Hello,
>
> I have a unicode string s, for example u"abc", u"äöü", u"123" or
> something else, and I have to find out whether
>
> 1. s is not empty and contains only digits (as in u"123", but not in
> u"3.1415")
>
> or
>
> 2. s is empty or contains only whitespace
>
> For all other cases I would assume a "normal" unicode string in s,
> whatever that may be.
>
> For the first case it could be s.isdigit(), s.isnumeric() or
> s.isdecimal() - but which one is the best?

Depends what you are trying to do. Only you can decide which is best. The three methods do slightly different things:

- isdigit() tests for the digit characters 0...9, their equivalents in other scripts, and other characters that have a Unicode digit value, such as superscript digits.

- isdecimal() tests for decimal characters. That includes the so-called "Arabic numerals" 0...9 (although the Arabs don't use them!) as well as other decimal digits like ٠١٢... (The above three are ARABIC-INDIC DIGIT ZERO through TWO.)

- isnumeric() tests for characters which have the Unicode numeric value property. That includes decimal digits, as well as non-digit numbers such as ½ and ¾.

If you want to test for something that a human reader will recognise as a "whole number", s.isdigit() is probably the best one to use.

> For the second case it should be s.isspace(), but it works only on
> strings, not on unicode strings?

What gives you that impression? isspace works on Unicode strings too.

    py> '   x'.isspace()
    False
    py> '   '.isspace()
    True

For the second case, you also need to check for empty strings, because isspace() returns False for an empty string. So you should use:

    not s or s.isspace()

which will return True if s is empty or all whitespace, otherwise False.

--
Steven
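The two checks discussed above can be combined into one function. This is only a sketch for Python 3; the name classify and the labels it returns are made up for illustration. It assumes the digits are meant to be passed to int() later, which is why it uses isdecimal(); swap in isdigit() or isnumeric() if a looser test is wanted.

```python
def classify(s):
    """Hypothetical helper: label s as 'blank', 'digits', or 'other'."""
    # Case 2: empty, or nothing but whitespace.
    if not s or s.isspace():
        return "blank"
    # Case 1: non-empty and only decimal digits (assumption: the caller
    # wants something int() will accept, so isdecimal() is used here).
    if s.isdecimal():
        return "digits"
    # Everything else is treated as a "normal" string.
    return "other"


if __name__ == "__main__":
    for sample in ("", "   ", "123", "3.1415", "abc", "äöü"):
        print(repr(sample), classify(sample))
```

Testing for blank first means the digit test never sees an empty string, although isdecimal() would return False for it anyway.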
Re: [Tutor] unicode: alpha, whitespaces and digits
On Sun, 29 Dec 2013 19:20:04 +, Mark Lawrence wrote:
> 2. s is empty or contains only whitespaces

Call strip() on it. If the result is empty, s was either empty or all whitespace.

--
DaveA
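A short sketch of the strip() approach in Python 3. The function name is_blank is hypothetical. Because strip() on an empty string returns an empty string, the empty case needs no separate check.

```python
def is_blank(s):
    """Hypothetical helper: True if s is empty or contains only whitespace."""
    # strip() with no arguments removes leading and trailing whitespace;
    # an empty result means there was nothing else in the string.
    return not s.strip()


if __name__ == "__main__":
    for sample in ("", "   ", "\t\n", "\u00a0", "  x  ", "123"):
        # Compare against the "not s or s.isspace()" form for the same input.
        print(repr(sample), is_blank(sample), not sample or sample.isspace())
```

Both forms give the same answer for these samples; strip() builds a new string, while isspace() only inspects the existing one.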
Re: [Tutor] unicode: alpha, whitespaces and digits
On 29/12/2013 13:36, Ulrich Goebel wrote:
> Hello,
>
> I have a unicode string s, for example u"abc", u"äöü", u"123" or
> something else, and I have to find out whether
>
> 1. s is not empty and contains only digits (as in u"123", but not in
> u"3.1415")
>
> or
>
> 2. s is empty or contains only whitespace
>
> For all other cases I would assume a "normal" unicode string in s,
> whatever that may be.
>
> For the first case it could be s.isdigit(), s.isnumeric() or
> s.isdecimal() - but which one is the best?
>
> For the second case it should be s.isspace(), but it works only on
> strings, not on unicode strings?
>
> Many thanks for any help!
> Ulrich

This depends on whether you are using Python 2 or 3. In the latter all strings are unicode. Please see http://docs.python.org/X/library/stdtypes.html#string-methods where X is 2 or 3. You might also want to look at http://docs.python.org/3.3/howto/unicode.html

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence