Re: [Tutor] Regex not working as desired

2018-03-06 Thread Alan Gauld via Tutor

On 06/03/18 22:17, Albert-Jan Roskam wrote:
> But the way you wrote it, the generator expression just "floats" 

Any expression can be used where a value is expected provided
that expression produces a value of the required type.

A generator expression effectively produces a sequence and
the type of sequence is defined by the type of parentheses
used.

"123" -> a string
[1,2,3] -> a list
(1,2,3) -> a tuple
{1,2,3} -> a set

So when a function requires an iterable sequence you just
provide the expression(any expression) that results in an
iterable.

all(range(5))  # range 5 produces a range object which is iterable
all(n for n in [0,1,2,3,4])  # generator "equivalent" to the range

Similar things happen with tuples where the parens are actually
optional:

1,2,3   # a tuple of 3 numbers
(1,2,3) # the same tuple with parens to make it more obvious

HTH
-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] Regex not working as desired

2018-03-06 Thread Steven D'Aprano

On Tue, Mar 06, 2018 at 10:17:20PM +, Albert-Jan Roskam wrote:

> > >>> all(c.isdigit() for c in '12c4')
> > False
> 
> I never understood why this is syntactically correct. It's like two 
> parentheses are missing.
> 
> This I understand:
> all((c.isdigit() for c in '12c4'))
> Or this:
> all([c.isdigit() for c in '12c4'])
> Or this:
> all((True, False))
> 
> But the way you wrote it, the generator expression just "floats" in 
> between the parentheses that are part of the all() function. Is this 
> something special about all() and any()? 

No, it is something special about generator expressions. The syntax for 
them is theoretically:

expression for x in iterable

but parentheses are required to make it unambiguous. If they are already 
inside parentheses, as in a function call:

   spam(expression for x in iterable)

the function call parens are sufficient to make it unambiguous and so 
there is no need to add an extra pair.

However, if you have two arguments, or some other compound expression, 
you need to use disambiguation parens:

spam(999, (expression for x in iterable))
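As a small sketch of that rule: the names total and compiles below are made up for illustration, and compile() is only used so the invalid form can be shown without stopping the script.

```python
def total(values, start=0):
    """Hypothetical helper: add up an iterable, beginning at start."""
    return sum(values, start)

# Sole argument: the call's own parentheses are enough.
only_arg = total(n * n for n in range(4))

# A second argument is present, so the generator expression
# needs its own pair of parentheses.
with_start = total((n * n for n in range(4)), 100)

def compiles(source):
    """Return True if source is a syntactically valid expression."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True

print(only_arg, with_start)
print(compiles("total(n * n for n in range(4))"))
print(compiles("total(n * n for n in range(4), 100)"))
```

The last line is the interesting one: without the extra parentheses the interpreter rejects the call before it ever runs.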

-- 
Steve

Re: [Tutor] Regex not working as desired

2018-03-06 Thread Albert-Jan Roskam

On Feb 27, 2018 09:50, Alan Gauld via Tutor  wrote:
>
> On 27/02/18 05:13, Cameron Simpson wrote:
>
> > hard to debug when you do. That's not to say you shouldn't use them, but 
> > many
> > people use them for far too much.
>
>
> > Finally, you could also consider not using a regexp for this particular 
> > task.
> > Python's "int" class can be called with a string, and will raise an 
> > exception
>
> And, as another alternative, you can use all() with a
> generator expression:
>
> >>> all(c.isdigit() for c in '1234')
> True
> >>> all(c.isdigit() for c in '12c4')
> False

I never understood why this is syntactically correct. It's like two parentheses 
are missing.

This I understand:
all((c.isdigit() for c in '12c4'))
Or this:
all([c.isdigit() for c in '12c4'])
Or this:
all((True, False))

But the way you wrote it, the generator expression just "floats" in between the 
parentheses that are part of the all() function. Is this something special 
about all() and any()?

Re: [Tutor] Regex not working as desired

2018-02-27 Thread Alan Gauld via Tutor

On 27/02/18 09:50, Peter Otten wrote:
>> def all_digits(s):
>>     return all(c.isdigit() for c in s)
>  
> Note that isdigit() already checks all characters in the string:

Ah! I should have known that but forgot.
I think the singular name confused me.

> The only difference to your suggestion is how it handles the empty string:
> 
> >>> def all_digits(s):
> ...     return all(c.isdigit() for c in s)
> ...
> >>> all_digits("")
> True
> >>> "".isdigit()
> False

Interesting, I'd have expected all() to return
False for an empty sequence... But looking at help(all)
it clearly states that it returns True. RTFM! :-(

However, in practice, and for this specific case, the
try/except route is probably best, I just wanted to
point out that there were other (concise) ways to
avoid a regex.

-- 
Alan G


Re: [Tutor] Regex not working as desired

2018-02-27 Thread Peter Otten

Alan Gauld via Tutor wrote:

> On 27/02/18 05:13, Cameron Simpson wrote:
> 
>> hard to debug when you do. That's not to say you shouldn't use them, but
>> many people use them for far too much.
> 
> 
>> Finally, you could also consider not using a regexp for this particular
>> task. Python's "int" class can be called with a string, and will raise an
>> exception
> 
> And, as another alternative, you can use all() with a
> generator expression:
> 
> >>> all(c.isdigit() for c in '1234')
> True
> >>> all(c.isdigit() for c in '12c4')
> False

> 
> Or generally:
> 
> def all_digits(s):
>     return all(c.isdigit() for c in s)
 
Note that isdigit() already checks all characters in the string:

>>> "123".isdigit()
True
>>> "1a1".isdigit()
False

The only difference to your suggestion is how it handles the empty string:

>>> def all_digits(s):
...     return all(c.isdigit() for c in s)
...
>>> all_digits("")
True
>>> "".isdigit()
False
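If the empty string should be rejected while keeping the all() approach, one possible sketch is to test for emptiness first. The name all_digits_nonempty is made up for this illustration.

```python
def all_digits(s):
    # all() of an empty iterable is True, so "" passes this check.
    return all(c.isdigit() for c in s)

def all_digits_nonempty(s):
    # bool("") is False, so the empty string is rejected up front.
    return bool(s) and all_digits(s)

for sample in ["123", "1a1", ""]:
    print(repr(sample), all_digits(sample),
          all_digits_nonempty(sample), sample.isdigit())
```

With the guard in place, the function should agree with str.isdigit() on all three samples.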

A potential problem of str.isdigit() -- and int() -- may be its unicode 
awareness:

>>> s = "\N{CHAM DIGIT ONE}\N{CHAM DIGIT TWO}\N{CHAM DIGIT THREE}"
>>> s
'꩑꩒꩓'
>>> s.isdigit()
True
>>> int(s)
123
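If only the ASCII digits 0-9 should be accepted, one option is an explicit character class. This is only a sketch; the names ASCII_DIGITS and is_ascii_digits are invented for the example.

```python
import re

# An explicit class: only the ten ASCII digits, one or more of them.
ASCII_DIGITS = re.compile(r"[0-9]+")

def is_ascii_digits(s):
    """Return True if s is non-empty and made up only of 0-9."""
    return ASCII_DIGITS.fullmatch(s) is not None

cham = "\N{CHAM DIGIT ONE}\N{CHAM DIGIT TWO}\N{CHAM DIGIT THREE}"

print(is_ascii_digits("123"), is_ascii_digits(cham), is_ascii_digits(""))

# For str patterns in Python 3, \d is Unicode-aware as well.
print(re.fullmatch(r"\d+", cham) is not None)
```

Whether Unicode digits are a problem depends on what the program does with the text afterwards, so this restriction is a choice, not a requirement.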



Re: [Tutor] Regex not working as desired

2018-02-27 Thread Alan Gauld via Tutor

On 27/02/18 05:13, Cameron Simpson wrote:

> hard to debug when you do. That's not to say you shouldn't use them, but many 
> people use them for far too much.


> Finally, you could also consider not using a regexp for this particular task. 
>  
> Python's "int" class can be called with a string, and will raise an exception 

And, as another alternative, you can use all() with a
generator expression:

>>> all(c.isdigit() for c in '1234')
True
>>> all(c.isdigit() for c in '12c4')
False
>>>

Or generally:

def all_digits(s):
    return all(c.isdigit() for c in s)

-- 
Alan G



Re: [Tutor] Regex not working as desired

2018-02-27 Thread Steven D'Aprano

On Mon, Feb 26, 2018 at 11:01:49AM -0800, Roger Lea Scherer wrote:
>   The first step is to input data and then I want to check to make sure
> there are only digits and no other type of characters. I thought regex
> would be great for this.

I'm going to quote Jamie Zawinski:

Some people, when confronted with a problem, think "I know, 
I'll use regular expressions." Now they have two problems.

Welcome to the club of people who discovered that regexes are just as 
likely to make things worse as better :-(

Here's another, simpler way to check for all digits:

value = '12345'  # for example
value.isdigit()

The isdigit() method will return True if value is non-empty and contains 
nothing but digits, and False otherwise (including for the empty string).

Sounds like just what you want, right? Nope. It *seems* good right up to 
the moment you enter a negative number:

py> '-123'.isdigit()
False

Or you want a number including a decimal point. Floating point numbers 
are *especially* tricky to test for, as you have to include:

# mantissa
optional + or - sign
zero or more digits
optional decimal point (but no more than one!)
zero or more digits
but at least one digit either before or after the decimal point;
# optional exponent
E or e
optional + or - sign
one or more digits

It is hard to write a regex to match floats.
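To give a feel for the effort involved, here is one possible attempt at the rules listed above. It is a sketch only; the names FLOAT_RE and looks_like_float are made up, and the pattern deliberately covers just the listed cases.

```python
import re

FLOAT_RE = re.compile(
    r"""
    [+-]?                    # optional sign
    (?: \d+ \.? \d*          # digits, optional point, optional digits
      | \. \d+               # or a leading point followed by digits
    )
    (?: [eE] [+-]? \d+ )?    # optional exponent
    """,
    re.VERBOSE | re.ASCII,
)

def looks_like_float(text):
    """Return True if text fits the mantissa/exponent rules above."""
    return FLOAT_RE.fullmatch(text) is not None

for sample in ["1", "1.", ".5", "-1.5e10", "+3E-2", ".", "1e", "1.2.3"]:
    print(repr(sample), looks_like_float(sample))
```

Even this version disagrees with float() in places: float() also accepts strings such as "inf" and "nan", and tolerates surrounding whitespace, none of which the pattern allows.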

Which brings us to a better tactic for ensuring that values are a valid 
int or float: try it and see!

Instead of using the Look Before You Leap tactic:

    if string looks like an int:
        number = int(string)  # hope this works, if not, we're in trouble!
    else:
        handle the invalid input

we can use the "Easier To Ask For Forgiveness Than Permission" tactic, 
and just *try* converting it, and deal with it if it fails:

    try:
        number = int(string)
    except ValueError:
        handle the invalid input

The same applies for floats, of course.
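A minimal sketch of that approach covering both cases might look like the following. The function name to_number and the choice of returning None for bad input are assumptions made for the example.

```python
def to_number(text):
    """Return text converted to an int if possible, else to a float,
    else None when neither conversion accepts it."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue  # try the next converter
    return None

for sample in ["-123", "1.5e3", " 42 ", "abc", ""]:
    print(repr(sample), to_number(sample))
```

Note that no pattern for signs, decimal points or exponents appears anywhere; the interpreter's own rules decide what counts as a number.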

Now, one great benefit of this is that the interpreter already knows 
what makes a proper int (or float), and *you don't have to care*. Let 
the interpreter deal with it, and only if it fails do you have to deal 
with the invalid string.

By the way: absolutely *none* of the turtle graphics code is the least bit 
relevant to your question, and we don't need to see it all. That's a bit 
like going to the supermarket to return a can of beans that you bought 
because they had gone off:

"Hi, I bought this can of beans yesterday, but when I got it home and 
opened it, they were all mouldy and green inside. Here's my receipt, 
and the can, and here's the can opener I used to open them, and the bowl 
I was going to put the beans into, and the microwave oven I would have 
used to heat them up, and the spoon for stirring them, and the toast I had 
made to put the beans on, and the salt and pepper shakers I use."

:-)

-- 
Steve

Re: [Tutor] Regex not working as desired

2018-02-27 Thread Terry Carroll


On Mon, 26 Feb 2018, Terry Carroll wrote:


Instead of looking fo re xcaprions..


Wow. That should read "Instead of looking for exceptions..." Something 
really got away from me there.


--
Terry Carroll
carr...@tjc.com

Re: [Tutor] Regex not working as desired

2018-02-27 Thread Terry Carroll


On Mon, 26 Feb 2018, Roger Lea Scherer wrote:


> """ ensure input is no other characters than digits
> sudocode: if the input has anything other than digits
> return digits  """
>
> p = re.compile(r'[^\D]')


I'm not so great at regular expressions, but this regex appears to be 
searching for a single character that is "not a non-digit", in other 
words a single digit.


 "[...]" says, look for anything in this set of characters; and you have 
two things:

 ^ : as the first character inside "[...]", this negates the set (it only 
     means start-of-string outside the brackets)
 \D : any non-digit

Instead of looking fo re xcaprions, I would look for what you *do* want. 
This regex should do it for you:


  r'^\d+$'

This is looking for a start-of-string ("^"); then a digit ("\d") that 
occurs at least once (the "+" quantifier); then an end-of-string ("$").


In other words, one or more digits, with nothing else before or after.

Here's a simple looping test to get you started (ignore the "from 
__future__" line; I'm running Python 2):


from __future__ import print_function
import re
p = re.compile(r'^\d+$')
test_data = ["4jkk33", "4k33", "4jjk4", "4334", "4","44", "444", ""]
for thing in test_data:
    m = p.match(thing)
    if m is None:
        print("not all digits:", thing)
    else:
        print("all digits:", thing)
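One detail worth knowing about "$": it also matches just before a newline at the very end of the string. The sketch below shows the difference; it assumes Python 3.4 or later, where fullmatch() is available, so it does not apply to the Python 2 setup mentioned above.

```python
import re

p = re.compile(r'^\d+$')

print(p.match("4334") is not None)
# "$" also matches just before a final newline, so this still matches.
print(p.match("4334\n") is not None)
# fullmatch() requires the whole string to be consumed.
print(p.fullmatch("4334\n") is not None)
print(re.fullmatch(r'\d+', "4334") is not None)
```

Text read with input() has no trailing newline, so the original loop is not affected; the difference matters mainly for lines read from a file.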


--
Terry Carroll
carr...@tjc.com


Re: [Tutor] Regex not working as desired

2018-02-26 Thread Cameron Simpson


On 26Feb2018 11:01, Roger Lea Scherer  wrote:

> The first step is to input data and then I want to check to make sure
> there are only digits and no other type of characters. I thought regex
> would be great for this.


Many people do :-) They are a reasonable tool for an assortment of text 
matching tasks, but as you're discovering they can be easy to get wrong and 
hard to debug when you do. That's not to say you shouldn't use them, but many 
people use them for far too much.



> The program works great, but no matter what I
> enter, the regex part does the same thing. By same thing I mean this:

[...]

> Please enter an integer less than 10,000 greater than 0:  4jkk33
> No match
> Please enter an integer less than 10,000 greater than 0:  4k33
> No match
> Please enter an integer less than 10,000 greater than 0:  4jjk4
> No match
> Please enter an integer less than 10,000 greater than 0:  4334
> No match


So, "no match regardless of the input".


> So I don't know what I'm doing wrong. The cipher will still draw, but I
> want to return an "error message" in this case print("No match"), but it
> does it every time, even when there are only digits; that's not what I
> want. Please help. Below is my code:


Thank you for the code! Many people forget to include it. I'm going to trim for 
readability...


[...]

> digits = input("Please enter an integer less than 10,000 greater than 0:  ")
>
> """ ensure input is no other characters than digits
> sudocode: if the input has anything other than digits
> return digits  """
>
> #def digit_check(digits):
> # I thought making it a function might h
> p = re.compile(r'[^\D]')


This seems a slightly obtuse way to match a digit. You're matching "not a 
nondigit". You could just use \d to match a digit, which is more readable.


This regular expression also matches a _single_ digit.


> m = p.match(digits)


Note that match() matches at the beginning of the string.

I notice that all your test strings start with a digit. That is why the regular 
expression always matches.



> if m:
>    print("No match")


This seems upside down, since your expression matches a digit.

Ah, I see what you've done.

The "^" marker has 2 purposes in regular expressions. At the start of a regular 
expression it requires the expression to match at the start of the string. At 
the start of a character range inside [] it means to invert the range. So:


 \d     A digit.
 \D     A nondigit.
 ^\D    A nondigit at the start of the string
 [^\D]  "not a nondigit" ==> a digit
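The four patterns in the table can be tried out directly. This is just an illustrative sketch; the helper name first_match and the sample strings are made up.

```python
import re

def first_match(pattern, text):
    """Return what re.match() finds at the start of text, or None."""
    m = re.match(pattern, text)
    return m.group() if m else None

samples = ["4jkk33", "x123"]
patterns = [r"\d", r"\D", r"^\D", r"[^\D]"]

for pattern in patterns:
    for text in samples:
        print(pattern, repr(text), first_match(pattern, text))
```

Because match() is anchored at the start anyway, "\D" and "^\D" behave the same here, and "[^\D]" behaves exactly like "\d".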

The other thing that you may have missed is that the \d, \D etc shortcuts for 
various common characters do not need to be inside [] markers.


So I suspect you wanted to at least start with "a nondigit at the start of the 
string". That would be:


 ^\D

with no [] characters.

Now your wider problem seems to be to make sure your string consists entirely 
of digits. Since your logic looks like a match for invalid input, your regexp 
might look like this:


 \D

and you could use .search instead of .match to find the nondigit anywhere in 
the string instead of just at the start.
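A short sketch of the "look for something invalid" version, with search() in place of match(). The names nondigit and find_invalid are invented for the example.

```python
import re

nondigit = re.compile(r"\D")

def find_invalid(text):
    """Return the first non-digit character in text, or None."""
    m = nondigit.search(text)
    return m.group() if m else None

for sample in ["4jkk33", "4334", ""]:
    print(repr(sample), find_invalid(sample))

# match() only looks at the start, so it misses the "j" here.
print(nondigit.match("4jkk33"))
```

Notice that the empty string contains no nondigit either, so this test alone would let it through.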


Usually, however, it is better to write validation code which matches exactly 
what you actually want instead of trying to think of all the things that might 
be invalid. You want an "all digits" string, so you might write this:


 ^\d*$

which matches a string containing only digits from the beginning to the end.  
That's:


 ^   start of string
 \d  a digit
 *   zero or more of the digit
 $   end of string

Of course you really want at least one or more, so you would use "+" instead of 
"*".


So your code might look like:

 valid_regexp = re.compile(r'^\d+$')
 m = valid_regexp.match(digits)
 if m:
   pass  # input is valid
 else:
   pass  # input is invalid

Finally, you could also consider not using a regexp for this particular task.
Python's "int" class can be called with a string, and will raise an exception 
if that string is not a valid integer. This also has the advantage that you get 
an int back, which is easy to test for your other constraints (less than 10,000, 
greater than 0). Now, because int() raises an exception for bad input you need 
to phrase the test differently:


 try:
   value = int(digits)
 except ValueError:
   pass  # invalid input, do something here
 else:
   if value >= 10000 or value <= 0:
     pass  # value out of range, do something here
   else:
     pass  # valid input, use it
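Wrapped up as a function, the same idea could look like the sketch below. The name parse_cipher_number, the low/high parameters and the choice of returning None for bad input are all assumptions for illustration; the limits come from the prompt text in the original program.

```python
def parse_cipher_number(text, low=1, high=9999):
    """Return the integer in text if low <= value <= high, else None."""
    try:
        value = int(text)
    except ValueError:
        return None  # not a valid integer at all
    if value < low or value > high:
        return None  # a valid integer, but out of range
    return value

for sample in ["4334", "4jkk33", "0", "10000", ""]:
    print(repr(sample), parse_cipher_number(sample))
```

The caller can then keep prompting until the result is not None.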

Cheers,
Cameron Simpson  (formerly c...@zip.com.au)
