buggy awk regex handling?

2012-08-02 Thread kaltheat


Hi,

I tried to replace three letters with three other letters in awk using
the sub() function.
I assumed that my regular expression means the following:

match if three consecutive letters of the alphabet occur anywhere in the input

$ echo AbC | awk '{sub(/[[:alpha:]]{3}/,"cBa"); print;}'
AbC

As you can see, the result was unexpected.
When I match one or more letters instead, it works:

$ echo AbC | awk '{sub(/[[:alpha:]]+/,"cBa"); print;}'
cBa

Same problem without the character class:

$ echo AbC | awk '{sub(/[A-Za-z]{3}/,"cBa"); print;}'
AbC

$ echo AbC | awk '{sub(/[A-Za-z]+/,"cBa"); print;}'
cBa

I thought that it might have something to do with the curly braces. But escaping
them doesn't do the trick.

What am I doing wrong?
Or is awk buggy?

Regards,
kaltheat

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: buggy awk regex handling?

2012-08-02 Thread RW
On Thu, 02 Aug 2012 13:20:52 +0200
kaltheat wrote:

 
 
> Hi,
>
> I tried to replace three letters with three other letters in awk using
> the sub() function. I assumed that my regular expression means the
> following:
>
> match if three consecutive letters of the alphabet occur anywhere in
> the input
>
> $ echo AbC | awk '{sub(/[[:alpha:]]{3}/,"cBa"); print;}'
> AbC
>
> As you can see, the result was unexpected.
> When I match one or more letters instead, it works:
>
> $ echo AbC | awk '{sub(/[[:alpha:]]+/,"cBa"); print;}'
> cBa
> ...
> What am I doing wrong?
> Or is awk buggy?

Traditional awk implementations don't support {n}, but I think POSIX
implementations should. 
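
A possible workaround, as a minimal sketch: an awk without interval support never sees {3} as a repeat count, so writing the bracket expression out three times avoids the braces altogether. The example assumes the replacement string is the quoted literal "cBa" and uses [A-Za-z] rather than [[:alpha:]], since some older awks also lack POSIX character classes.

```shell
# Repeat the bracket expression three times instead of using {3};
# this needs no interval support from the awk in use.
result=$(echo AbC | awk '{ sub(/[A-Za-z][A-Za-z][A-Za-z]/, "cBa"); print }')
echo "$result"
```

This should print cBa with any awk, at the cost of a longer pattern.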


Re: buggy awk regex handling?

2012-08-02 Thread Warren Block

On Thu, 2 Aug 2012, RW wrote:


> On Thu, 02 Aug 2012 13:20:52 +0200
> kaltheat wrote:
>
>> I tried to replace three letters with three other letters in awk using
>> the sub() function. I assumed that my regular expression means the
>> following:
>>
>> match if three consecutive letters of the alphabet occur anywhere in
>> the input
>>
>> $ echo AbC | awk '{sub(/[[:alpha:]]{3}/,"cBa"); print;}'
>> AbC
>>
>> As you can see, the result was unexpected.
>> When I match one or more letters instead, it works:
>>
>> $ echo AbC | awk '{sub(/[[:alpha:]]+/,"cBa"); print;}'
>> cBa
>> ...
>> What am I doing wrong?
>> Or is awk buggy?
>
> Traditional awk implementations don't support {n}, but I think POSIX
> implementations should.


Running the same command with gawk instead of awk confirms that.
Printing the return value of sub() (the number of substitutions
performed) makes it a little clearer:


% echo AbC | awk '{print sub(/[[:alpha:]]{3}/,"cBa"); print;}'
0
AbC

% echo AbC | gawk '{print sub(/[[:alpha:]]{3}/,"cBa"); print;}'
1
cBa

sed can handle it:

% echo AbC | sed -E 's/[[:alpha:]]{3}/cBa/'
cBa
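
To check which behaviour a given awk has, a small probe along these lines may help. It is only a sketch: it assumes that an awk without interval support takes the braces literally, so /a{3}/ fails to match the input "aaa". The messages printed are made up for this example.

```shell
# Probe whether the local awk honours interval expressions such as {3}.
# With support, /a{3}/ matches "aaa"; without it, the pattern is
# typically read as the literal text "a{3}" and does not match.
probe=$(echo aaa | awk '{ if ($0 ~ /a{3}/) print "intervals supported"; else print "intervals not supported" }')
echo "$probe"
```

Substituting gawk for awk in the probe gives a quick comparison between the two on the same machine.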