>: Indeed. I wrote the following test:
>: 
>: Pattern p = Pattern.compile("(.*)");
>: Matcher m = p.matcher("xyz");
>: Assert.assertEquals("", "Video", m.replaceAll("Video"));
>: 
>: The test fails. It gives "VideoVideo" as the actual result. I guess there is
>: something about Matcher.replaceAll that I don't know. Off to read the
>: javadocs then.
>
>".*" matches the empty string (for that matter, any regex clause with the
>"*" modifier applied matches the empty string), and iterating over pattern
>matches (i.e. what happens if you call Matcher.find() or
>Matcher.replaceAll()) always advances to the "first character not matched by
>[the previous] match" (i.e. let prev = m.end(); if m.find() succeeds, then
>prev <= m.start()).
>
>So ".*" matches twice on any non-empty String x (with no line terminators) ...
>once when it matches from 0 to x.length(), and once when it matches the empty
>string starting and ending at x.length().
>
>That's why using "^.*" doesn't have this problem ... "*" is greedy so it 
>only matches once at the start of the string and then there can't be any 
>more matches.  Conversely: ".*$" and ".*\z" will still have this problem,
>because any number of matches can have the same ending offset.
>
>
>-Hoss
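
A minimal Java sketch of the behaviour described above, using only
java.util.regex. The class name MatchOffsets and the helper offsets() are
made up for illustration. It records the start and end offset of every
successful find():

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchOffsets {
    // Hypothetical helper: returns "[start,end)" for each successive find().
    static List<String> offsets(String regex, String input) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            spans.add("[" + m.start() + "," + m.end() + ")");
        }
        return spans;
    }

    public static void main(String[] args) {
        // Compare the unanchored pattern with the two anchored variants.
        for (String regex : new String[] {".*", "^.*", ".*$"}) {
            System.out.println(regex + " -> " + offsets(regex, "xyz"));
        }
    }
}
```

On "xyz" this should report two spans for ".*" and ".*$" (the whole string,
then the empty match at offset 3) and a single span for "^.*".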

Hmmm, given the chance, Perl behaves the same, although attempting
to use /*/ fails (a quantifier with nothing to repeat). Another lesson learnt!

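#! /usr/local/bin/perl
use strict;
use warnings;
my($s)="cat mat rat hat";
my($c)=0;

# a: a literal pattern, one match per "at"
print " a-match", ++$c, "='$1'\n" while( $s =~ m/(at)/g );
$c=0;
# b: unanchored .* (whole string, then the empty string at the end)
print " b-match", ++$c, "='$1'\n" while( $s =~ m/(.*)/g );
$c=0;
# c: anchored at the start
print " c-match", ++$c, "='$1'\n" while( $s =~ m/^(.*)/g );
$c=0;
# d: anchored at the end
print " d-match", ++$c, "='$1'\n" while( $s =~ m/(.*)$/g );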
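
For comparison, roughly the same experiment in Java using
Matcher.replaceAll(). This is only a sketch: ReplaceAllCheck and replace()
are hypothetical names, and the replacement "X" is just a marker so that
each match shows up as one X in the result.

```java
import java.util.regex.Pattern;

class ReplaceAllCheck {
    // Hypothetical helper: replaces every match of regex in input.
    static String replace(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String s = "cat mat rat hat";
        // Same four patterns as the Perl script above.
        for (String regex : new String[] {"(at)", "(.*)", "^(.*)", "(.*)$"}) {
            System.out.println(regex + " -> '" + replace(regex, s, "X") + "'");
        }
    }
}
```

The number of X characters in each result is the number of matches, so the
unanchored and end-anchored patterns should give two and the start-anchored
one should give one.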

-- 

===============================================================
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================
