Re: Regex to match odd numbers

2014-08-23 Thread Abigail
On Sat, Aug 23, 2014 at 04:43:06PM +0100, Aaron Crane wrote:
> Paul Makepeace  wrote:
> > Does anyone have any concrete examples where the locale affecting
> > meaning/matching of \d causes real problems?
> 
> In my experience, it's not necessarily the locale so much as Unicode
> characters that the programmer wasn't expecting, which then cause
> surprising behaviour. For example, this looks more-or-less sensible:
> 
> say $arg + 17 if $arg =~ /\A\d+\z/
> 
> but if $arg is a digit other than 0..9, Perl will treat it as 0 and
> emit a warning. (Which is particularly problematic if you also have
> fatal warnings enabled.)


Using [0-9] is much safer, but there are still gotchas. I've been
bitten by:

my ($value) = $str =~ /([0-9]+)/;
if ($value) {  # Don't divide by 0
$something /= $value;
}

My program crashed because it was dividing by zero... /[0-9]+/ matches
"00", which is true, but numerically still 0.

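A minimal sketch of the gotcha above, assuming the goal is simply to skip
the division when the captured digits are numerically zero. The helper name
safe_divisor is hypothetical and not from the original program; the point is
that the test must be numeric (== 0) rather than string truth.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first run of ASCII digits as a number,
# or undef if there are no digits or the digits are numerically zero.
sub safe_divisor {
    my ($str) = @_;
    my ($value) = $str =~ /([0-9]+)/;
    return unless defined $value;   # no digits at all
    return if $value == 0;          # "0", "00", "000" are all numerically zero
    return $value + 0;              # normalise "007" to 7
}

# "00" is a true string in Perl, which is what bites the naive check.
print "string truth of '00': ", ("00" ? "true" : "false"), "\n";

for my $s ("00", "0", "007", "x12y") {
    my $d = safe_divisor($s);
    printf "%-5s divisor=%s\n", $s, defined $d ? $d : "none";
}
```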


Abigail


Re: Regex to match odd numbers

2014-08-23 Thread Aaron Crane
Paul Makepeace  wrote:
> Does anyone have any concrete examples where the locale affecting
> meaning/matching of \d causes real problems?

In my experience, it's not necessarily the locale so much as Unicode
characters that the programmer wasn't expecting, which then cause
surprising behaviour. For example, this looks more-or-less sensible:

say $arg + 17 if $arg =~ /\A\d+\z/

but if $arg is a digit other than 0..9, Perl will treat it as 0 and
emit a warning. (Which is particularly problematic if you also have
fatal warnings enabled.)
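
A small sketch of the difference, assuming Perl 5.14 or later for the /a
flag. The three sub names are made up for illustration. U+0661 is
ARABIC-INDIC DIGIT ONE, which has general category Nd and so satisfies the
Unicode \d but not [0-9] or \d under /a.

```perl
use strict;
use warnings;

# Illustrative helpers (names are assumptions, not from the thread).
sub matches_d       { $_[0] =~ /\A\d+\z/     ? 1 : 0 }   # Unicode-aware \d
sub matches_d_ascii { $_[0] =~ /\A\d+\z/a    ? 1 : 0 }   # /a restricts \d to 0-9
sub matches_class   { $_[0] =~ /\A[0-9]+\z/  ? 1 : 0 }   # explicit ASCII class

for my $case (["ASCII 42", "42"], ["U+0661", "\x{661}"]) {
    my ($label, $s) = @$case;
    printf "%-8s \\d=%d \\d/a=%d [0-9]=%d\n",
        $label, matches_d($s), matches_d_ascii($s), matches_class($s);
}
```

Only the first column should differ between the two rows: the non-ASCII
digit passes the plain \d check and would then be used as a number.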

> I'm assuming the worst case is it matches too much, e.g. picks up
> spurious Chinese numerals, which seems like a wildly improbable edge
> case for most datasets+patterns.

"Improbable" sounds reasonable, but bear in mind that people often use
regexes containing things like \d and \w for validating input from
untrusted sources, so there's scope for significant brokenness there.

> Presumably there isn't a situation
> where \d _doesn't_ match [0-9] at least? In other words [0-9] is a
> subset of \d for all locales.

For all *sane* locales, sure. :-)

One of the many unpleasant things about locales is that you never
really know what you're going to get — and there's no shortage of OSes
with broken locale definitions.

> $ export LC_CTYPE=zh_CN.utf-8
> $ perl -Mlocale -Mutf8 -le 'print "一" =~ /\d/'  # 1
>
> Doesn't print 1 - why?

I don't know what the expected behaviour is for the zh_CN.utf-8
locale, but that behaviour doesn't surprise me for Unicode: the hanzi
numerals don't have the Unicode "numeric" property. More specifically,
their general category is Lo ("other letter"), rather than (say) Nd
("decimal digit"):

$ perl -MUnicode::UCD=charinfo -E \
> 'say charinfo($_)->{category}, " ", chr =~ /\d/u for 0x4e00, 0x661'
Lo
Nd 1

(U+0661 is ARABIC-INDIC DIGIT ONE.)

> $ export LC_CTYPE=en_US.utf-8
> $ perl -Mlocale -Mutf8 -le 'print "三" =~ /[一-六]/'
> 1
>
> Why is it still 1?

That's because /[一-六]/ matches the set of characters whose codepoints
are in the range 0x4E00 through 0x516D (regardless of locale), and 三
is U+4E09 (which is in that range). Adding 'use re "debug"' to your
program reveals more information about what's going on there:

$ perl -Mlocale -Mutf8 -le 'use re "debug"; print "三" =~ /[一-六]/'
Compiling REx "[%x{4e00}-%x{516d}]"
Final program:
   1: ANYOF{loc}[{unicode}4E00-516D] (12)
  12: END (0)
stclass ANYOF{loc}[{unicode}4E00-516D] minlen 1
Matching REx "[%x{4e00}-%x{516d}]" against "%x{4e09}"
UTF-8 pattern and string...
Matching stclass ANYOF{loc}[{unicode}4E00-516D] against "%x{4e09}" (3 bytes)
   0 <> <%x{4e09}>   |  1:ANYOF{loc}[{unicode}4E00-516D](12)
   3 <%x{4e09}> <>   | 12:END(0)
Match successful!
1
Freeing REx: "[%x{4e00}-%x{516d}]"

-- 
Aaron Crane ** http://aaroncrane.co.uk/



Re: Regex to match odd numbers

2014-08-22 Thread Paul Makepeace
$thread->resurrect();


On Tue, May 27, 2014 at 12:37 PM, Mark Fowler  wrote:
>
> On Tuesday, May 27, 2014, Sam Kington  wrote:
> >
> > Sounds like you want something like
> >
> > / ( ^ 5[.] ( [79] | \d+ [13579] ) ) /x
> >
>
> This is where I mention that \d matches characters other than [0-9] unless
> you have the /a flag in effect (thanks Unicode!)

Does anyone have any concrete examples where the locale affecting
meaning/matching of \d causes real problems?

I'm assuming the worst case is it matches too much, e.g. picks up
spurious Chinese numerals, which seems like a wildly improbable edge
case for most datasets+patterns. Presumably there isn't a situation
where \d _doesn't_ match [0-9] at least? In other words [0-9] is a
subset of \d for all locales.

$ export LC_CTYPE=zh_CN.utf-8
$ perl -Mlocale -Mutf8 -le 'print "一" =~ /\d/'  # 1

Doesn't print 1 - why?

$ export LC_CTYPE=zh_CN.utf-8
$ perl -Mlocale -Mutf8 -le 'print "三" =~ /[一-六]/'  # 3 in 1-6? Yes
1
$ export LC_CTYPE=en_US.utf-8
$ perl -Mlocale -Mutf8 -le 'print "三" =~ /[一-六]/'
1

Why is it still 1? OS X with Perl 5.16.2

Paul



Re: Regex to match odd numbers

2014-05-28 Thread David Cantrell
On Tue, May 27, 2014 at 03:37:51PM -0400, Mark Fowler wrote:
> On Tuesday, May 27, 2014, Sam Kington  wrote:
> > Sounds like you want something like
> >   / ( ^ 5[.] ( [79] | \d+ [13579] ) ) /x
> This is where I mention that \d matches characters other than [0-9] unless
> you have the /a flag in effect (thanks Unicode!)

I'm sure that if the p5p cabal want to spoil peoples' day, they have far
more entertaining things they can do than to release perl version 5.2٥.

-- 
David Cantrell | A machine for turning tea into grumpiness

  Irregular English:
ladies glow; gentlemen perspire; brutes, oafs and athletes sweat


Re: Regex to match odd numbers

2014-05-28 Thread David Cantrell
On Tue, May 27, 2014 at 05:08:16PM +0100, Jasper wrote:

> Something like the prime regex would work:
> ('1' x $foo) =~ /^(11)*1$/

Ahhh, the Right Sort Of Mad :-)

-- 
David Cantrell | Nth greatest programmer in the world

  Irregular English:
you have anecdotes; they have data; I have proof


Re: Regex to match odd numbers

2014-05-28 Thread David Cantrell
On Tue, May 27, 2014 at 04:42:40PM +0100, Alex Balhatchet wrote:

> Odd numbers end in [13579] so... m{5 [.] \d* [13579]}gmsx ... ?

Oh yeah, so embarrassingly obvious. The solution is to *remember that
version "numbers" aren't numbers, they're text*.

-- 
David Cantrell | London Perl Mongers Deputy Chief Heretic

Perl: the only language that makes Welsh look acceptable


Re: Regex to match odd numbers

2014-05-27 Thread Mark Fowler
On Tuesday, May 27, 2014, Sam Kington  wrote:
>
> Sounds like you want something like
>
> / ( ^ 5[.] ( [79] | \d+ [13579] ) ) /x
>

This is where I mention that \d matches characters other than [0-9] unless
you have the /a flag in effect (thanks Unicode!)

Mark


Re: Regex to match odd numbers

2014-05-27 Thread Dirk Koopman

On 27/05/14 17:21, Abigail wrote:

On Tue, May 27, 2014 at 04:54:47PM +0100, Dirk Koopman wrote:

On 27/05/14 16:22, David Cantrell wrote:

As part of the nasty mess that is CPANdeps, I have this line of code:

$record->{is_dev_perl} = (
$record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
) ? 1 : 0;

I'd like to not have to remember to add 23 to the list in a year or so's
time. Can anyone think of a nice way of matching any odd number from 7
upwards?  Obviously it's easy to do in a coupla lines of perl code
instead of a regex, so I'm asking more out of curiosity than because I
actually need it.



$bool = ($record->{perl} > 7) & 1; # for example?



   $ perl -wE 'say +(8 > 7) & 1'
   1
   $


It would be very odd to consider 8 to be odd.



Indeed, hence my follow up.

But this works:

perl -e 'for (0..30) {print ((($_ > 7) && ($_ & 1)) ? "$_ = 1\n" : "$_ = 0\n")}';


Dirk


Re: Regex to match odd numbers

2014-05-27 Thread Alex Balhatchet
And obviously I answered before letting your email sink in... "odd
number from 7 upwards"! But still, my solution should just about work
if you change \d* to \d+ and do the 9 manually :-)

- Alex

On 27 May 2014 16:42, Alex Balhatchet  wrote:
> Odd numbers end in [13579] so... m{5 [.] \d* [13579]}gmsx ... ?
>
> - Alex


Re: Regex to match odd numbers

2014-05-27 Thread Alex Balhatchet
Odd numbers end in [13579] so... m{5 [.] \d* [13579]}gmsx ... ?

- Alex


Re: Regex to match odd numbers

2014-05-27 Thread Abigail
On Tue, May 27, 2014 at 04:54:47PM +0100, Dirk Koopman wrote:
> On 27/05/14 16:22, David Cantrell wrote:
>> As part of the nasty mess that is CPANdeps, I have this line of code:
>>
>> $record->{is_dev_perl} = (
>>$record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
>> ) ? 1 : 0;
>>
>> I'd like to not have to remember to add 23 to the list in a year or so's
>> time. Can anyone think of a nice way of matching any odd number from 7
>> upwards?  Obviously it's easy to do in a coupla lines of perl code
>> instead of a regex, so I'm asking more out of curiosity than because I
>> actually need it.
>>
>
> $bool = ($record->{perl} > 7) & 1; # for example?
>

  $ perl -wE 'say +(8 > 7) & 1'
  1
  $


It would be very odd to consider 8 to be odd.


Abigail


Re: Regex to match odd numbers

2014-05-27 Thread Jasper
I'm furiously trying to think of an even stupider way of doing this...


On 27 May 2014 17:09, Jasper  wrote:

> From 7 upwards is
>
> ('1' x $foo) =~ /^(11){3,}1$/
>
> or something, I'm not even bothering to test this :S
>
>
> On 27 May 2014 17:08, Jasper  wrote:
>
>> Something like the prime regex would work:
>>
>> ('1' x $foo) =~ /^(11)*1$/
>>
>>
>> On 27 May 2014 16:43, Joel Bernstein  wrote:
>>
>>> Surely you only need to examine the right-most digit to know if the
>>> number
>>> is odd? Your special requirement to (AIUI) consider 0..7 as even isn't
>>> difficult to add.
>>>
>>> /joel
>>>
>>>
>>> On 27 May 2014 16:22, David Cantrell  wrote:
>>>
>>> > As part of the nasty mess that is CPANdeps, I have this line of code:
>>> >
>>> > $record->{is_dev_perl} = (
>>> >   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
>>> > ) ? 1 : 0;
>>> >
>>> > I'd like to not have to remember to add 23 to the list in a year or
>>> so's
>>> > time. Can anyone think of a nice way of matching any odd number from 7
>>> > upwards?  Obviously it's easy to do in a coupla lines of perl code
>>> > instead of a regex, so I'm asking more out of curiosity than because I
>>> > actually need it.
>>> >
>>> > --
>>> > David Cantrell | Official London Perl Mongers Bad Influence
>>> >
>>> > Today's previously unreported paraphilia is tomorrow's Internet
>>> sensation
>>> >
>>> >
>>>
>>
>>
>>
>> --
>> Jasper
>>
>
>
>
> --
> Jasper
>



-- 
Jasper


Re: Regex to match odd numbers

2014-05-27 Thread Jasper
From 7 upwards is

('1' x $foo) =~ /^(11){3,}1$/

or something, I'm not even bothering to test this :S
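
For the curious, a quick check of the unary pattern above. The sub name
is_odd_from_7 is hypothetical, and the group is made non-capturing, which
does not change what matches. (11){3,} needs at least six 1s, plus the
final 1, so the shortest match is seven.

```perl
use strict;
use warnings;

# Hypothetical wrapper: write $n in unary and apply the pattern from the message.
sub is_odd_from_7 {
    my ($n) = @_;
    return (('1' x $n) =~ /^(?:11){3,}1$/) ? 1 : 0;
}

my @hits = grep { is_odd_from_7($_) } 0 .. 23;
print "@hits\n";   # 7 9 11 13 15 17 19 21 23
```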


On 27 May 2014 17:08, Jasper  wrote:

> Something like the prime regex would work:
>
> ('1' x $foo) =~ /^(11)*1$/
>
>
> On 27 May 2014 16:43, Joel Bernstein  wrote:
>
>> Surely you only need to examine the right-most digit to know if the number
>> is odd? Your special requirement to (AIUI) consider 0..7 as even isn't
>> difficult to add.
>>
>> /joel
>>
>>
>> On 27 May 2014 16:22, David Cantrell  wrote:
>>
>> > As part of the nasty mess that is CPANdeps, I have this line of code:
>> >
>> > $record->{is_dev_perl} = (
>> >   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
>> > ) ? 1 : 0;
>> >
>> > I'd like to not have to remember to add 23 to the list in a year or so's
>> > time. Can anyone think of a nice way of matching any odd number from 7
>> > upwards?  Obviously it's easy to do in a coupla lines of perl code
>> > instead of a regex, so I'm asking more out of curiosity than because I
>> > actually need it.
>> >
>> > --
>> > David Cantrell | Official London Perl Mongers Bad Influence
>> >
>> > Today's previously unreported paraphilia is tomorrow's Internet
>> sensation
>> >
>> >
>>
>
>
>
> --
> Jasper
>



-- 
Jasper


Re: Regex to match odd numbers

2014-05-27 Thread Jasper
Something like the prime regex would work:

('1' x $foo) =~ /^(11)*1$/


On 27 May 2014 16:43, Joel Bernstein  wrote:

> Surely you only need to examine the right-most digit to know if the number
> is odd? Your special requirement to (AIUI) consider 0..7 as even isn't
> difficult to add.
>
> /joel
>
>
> On 27 May 2014 16:22, David Cantrell  wrote:
>
> > As part of the nasty mess that is CPANdeps, I have this line of code:
> >
> > $record->{is_dev_perl} = (
> >   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
> > ) ? 1 : 0;
> >
> > I'd like to not have to remember to add 23 to the list in a year or so's
> > time. Can anyone think of a nice way of matching any odd number from 7
> > upwards?  Obviously it's easy to do in a coupla lines of perl code
> > instead of a regex, so I'm asking more out of curiosity than because I
> > actually need it.
> >
> > --
> > David Cantrell | Official London Perl Mongers Bad Influence
> >
> > Today's previously unreported paraphilia is tomorrow's Internet sensation
> >
> >
>



-- 
Jasper


Re: Regex to match odd numbers

2014-05-27 Thread Sam Kington
> As part of the nasty mess that is CPANdeps, I have this line of code:
> 
> $record->{is_dev_perl} = (
>  $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
> ) ? 1 : 0;
> 
> I'd like to not have to remember to add 23 to the list in a year or so's
> time. Can anyone think of a nice way of matching any odd number from 7
> upwards?  Obviously it's easy to do in a coupla lines of perl code
> instead of a regex, so I'm asking more out of curiosity than because I
> actually need it.

Sounds like you want something like

/ ( ^ 5[.] ( [79] | \d+ [13579] ) ) /x
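
A sketch of how that might slot into the original check, assuming the
rc|patch alternatives and the /i flag are kept as they were. The sub name
is_dev_perl is made up here, and the groups are non-capturing.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern above plus the original rc|patch test.
sub is_dev_perl {
    my ($perl) = @_;
    return $perl =~ / ^ 5[.] (?: [79] | \d+ [13579] ) | rc | patch /xi ? 1 : 0;
}

for my $v (qw(5.8.9 5.9.5 5.18.2 5.21.3 5.20.0-RC1)) {
    printf "%-11s %d\n", $v, is_dev_perl($v);
}
```

Because nothing anchors the end of the minor number, a hypothetical
"5.70.0" would also match via the [79] branch; whether that matters depends
on what version strings actually turn up.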

Sam
-- 
Website: http://www.illuminated.co.uk/



Re: Regex to match odd numbers

2014-05-27 Thread Abigail
On Tue, May 27, 2014 at 04:22:07PM +0100, David Cantrell wrote:
> As part of the nasty mess that is CPANdeps, I have this line of code:
> 
> $record->{is_dev_perl} = (
>   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
> ) ? 1 : 0;
> 
> I'd like to not have to remember to add 23 to the list in a year or so's
> time. Can anyone think of a nice way of matching any odd number from 7
> upwards?  Obviously it's easy to do in a coupla lines of perl code
> instead of a regex, so I'm asking more out of curiosity than because I
> actually need it.



You mean, 7, 9, or any number that requires at least two digits where
the last one is odd?

I'd write that as:

/^0*(?:7|9|[1-9][0-9]*[13579])$/


allowing for leading 0s.
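
A minimal sketch of applying that pattern to a version string, assuming the
minor number is the second dot-separated field. The names $odd_from_7 and
is_dev_minor are hypothetical; the regex itself is the one given above.

```perl
use strict;
use warnings;

my $odd_from_7 = qr/^0*(?:7|9|[1-9][0-9]*[13579])$/;

# Hypothetical helper: pull out the minor field as text and test it.
sub is_dev_minor {
    my ($version) = @_;
    my (undef, $minor) = split /[.]/, $version;
    return 0 unless defined $minor;
    return $minor =~ $odd_from_7 ? 1 : 0;
}

for my $v (qw(5.5 5.7 5.8.8 5.007 5.20.1 5.21.0 5.23)) {
    printf "%-7s %d\n", $v, is_dev_minor($v);
}
```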



Abigail


Re: Regex to match odd numbers

2014-05-27 Thread Dirk Koopman

On 27/05/14 16:54, Dirk Koopman wrote:


$bool = ($record->{perl} > 7) & 1; # for example?



Clearly, I'm getting too old for this. Or my prevarication level too high...



Re: Regex to match odd numbers

2014-05-27 Thread Gareth Harper
Apologies, didn't see the 7 upward (and missed the 5\.) .  This can almost
certainly look nicer or smaller with a negative assertion or something.
However this should do the job.

$record->{perl} =~ /(^5\.(7|9|[1-9][0-9]*[13579])|rc|patch)/i






On 27 May 2014 16:53, Gareth Harper  wrote:

> $record->{perl} =~ /(^[0-9]*[13579]|rc|patch)/i
>
> ?
>
>
> On 27 May 2014 16:22, David Cantrell  wrote:
>
>> As part of the nasty mess that is CPANdeps, I have this line of code:
>>
>> $record->{is_dev_perl} = (
>>   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
>> ) ? 1 : 0;
>>
>> I'd like to not have to remember to add 23 to the list in a year or so's
>> time. Can anyone think of a nice way of matching any odd number from 7
>> upwards?  Obviously it's easy to do in a coupla lines of perl code
>> instead of a regex, so I'm asking more out of curiosity than because I
>> actually need it.
>>
>> --
>> David Cantrell | Official London Perl Mongers Bad Influence
>>
>> Today's previously unreported paraphilia is tomorrow's Internet sensation
>>
>
>


Re: Regex to match odd numbers

2014-05-27 Thread Mike Whitaker

On 27 May 2014, at 16:22, David Cantrell  wrote:

> As part of the nasty mess that is CPANdeps, I have this line of code:
> 
> $record->{is_dev_perl} = (
> $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
> ) ? 1 : 0;
> 
> I'd like to not have to remember to add 23 to the list in a year or so's
> time. Can anyone think of a nice way of matching any odd number from 7
> upwards?

I may be sufficiently uncaffeinated this late on a logical Monday to be missing
something, but how about:

$record->{perl} =~ /(^5\.(7|9|\d+[13579])|rc|patch)/i

which should be good for a while




Re: Regex to match odd numbers

2014-05-27 Thread Dirk Koopman

On 27/05/14 16:22, David Cantrell wrote:

As part of the nasty mess that is CPANdeps, I have this line of code:

$record->{is_dev_perl} = (
   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
) ? 1 : 0;

I'd like to not have to remember to add 23 to the list in a year or so's
time. Can anyone think of a nice way of matching any odd number from 7
upwards?  Obviously it's easy to do in a coupla lines of perl code
instead of a regex, so I'm asking more out of curiosity than because I
actually need it.



$bool = ($record->{perl} > 7) & 1; # for example?




Re: Regex to match odd numbers

2014-05-27 Thread Gareth Harper
$record->{perl} =~ /(^[0-9]*[13579]|rc|patch)/i

?


On 27 May 2014 16:22, David Cantrell  wrote:

> As part of the nasty mess that is CPANdeps, I have this line of code:
>
> $record->{is_dev_perl} = (
>   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
> ) ? 1 : 0;
>
> I'd like to not have to remember to add 23 to the list in a year or so's
> time. Can anyone think of a nice way of matching any odd number from 7
> upwards?  Obviously it's easy to do in a coupla lines of perl code
> instead of a regex, so I'm asking more out of curiosity than because I
> actually need it.
>
> --
> David Cantrell | Official London Perl Mongers Bad Influence
>
> Today's previously unreported paraphilia is tomorrow's Internet sensation
>


Re: Regex to match odd numbers

2014-05-27 Thread Abigail
On Tue, May 27, 2014 at 04:22:07PM +0100, David Cantrell wrote:
> As part of the nasty mess that is CPANdeps, I have this line of code:
> 
> $record->{is_dev_perl} = (
>   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
> ) ? 1 : 0;
> 
> I'd like to not have to remember to add 23 to the list in a year or so's
> time. Can anyone think of a nice way of matching any odd number from 7
> upwards?  Obviously it's easy to do in a coupla lines of perl code
> instead of a regex, so I'm asking more out of curiosity than because I
> actually need it.


You mean, 7, 9, or any number using two or more digits, ending in an
odd one? That doesn't seem to be hard to write.


/^0*(?:7|9|[1-9][0-9]*[13579])$/



Abigail


Re: Regex to match odd numbers

2014-05-27 Thread Joel Bernstein
Surely you only need to examine the right-most digit to know if the number
is odd? Your special requirement to (AIUI) consider 0..7 as even isn't
difficult to add.
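
One way that might look, as a sketch only: check the right-most digit as
text and add the threshold separately. The sub name odd_minor_from_7 is
hypothetical, and it assumes the minor number has already been extracted
as a string of ASCII digits.

```perl
use strict;
use warnings;

# Hypothetical helper: odd (by last digit) and at least 7.
sub odd_minor_from_7 {
    my ($minor) = @_;
    return 0 unless $minor =~ /\A[0-9]+\z/;          # ASCII digits only
    return 0 if $minor < 7;                          # below the threshold
    return substr($minor, -1) =~ /[13579]/ ? 1 : 0;  # right-most digit is odd
}

print join(" ", grep { odd_minor_from_7($_) } 0 .. 23), "\n";
```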

/joel


On 27 May 2014 16:22, David Cantrell  wrote:

> As part of the nasty mess that is CPANdeps, I have this line of code:
>
> $record->{is_dev_perl} = (
>   $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
> ) ? 1 : 0;
>
> I'd like to not have to remember to add 23 to the list in a year or so's
> time. Can anyone think of a nice way of matching any odd number from 7
> upwards?  Obviously it's easy to do in a coupla lines of perl code
> instead of a regex, so I'm asking more out of curiosity than because I
> actually need it.
>
> --
> David Cantrell | Official London Perl Mongers Bad Influence
>
> Today's previously unreported paraphilia is tomorrow's Internet sensation
>
>


Regex to match odd numbers

2014-05-27 Thread David Cantrell
As part of the nasty mess that is CPANdeps, I have this line of code:

$record->{is_dev_perl} = (
  $record->{perl} =~ /(^5\.(7|9|11|13|15|17|19|21)|rc|patch)/i
) ? 1 : 0;

I'd like to not have to remember to add 23 to the list in a year or so's
time. Can anyone think of a nice way of matching any odd number from 7
upwards?  Obviously it's easy to do in a coupla lines of perl code
instead of a regex, so I'm asking more out of curiosity than because I
actually need it.

-- 
David Cantrell | Official London Perl Mongers Bad Influence

Today's previously unreported paraphilia is tomorrow's Internet sensation