>: Indeed. I wrote the following test:
>:
>: Pattern p = Pattern.compile("(.*)");
>: Matcher m = p.matcher("xyz");
>: Assert.assertEquals("", "Video", m.replaceAll("Video"));
>:
>: The test fails. It gives "VideoVideo" as the actual result. I guess there is
>: something about Matcher.replaceAll that I don't know. Off to read the
>: javadocs then.
>
>".*" matches the empty string (for that matter any regex clause with the
>"*" modifier applied matches the empty string), and iterating over pattern
>matches (ie: what happens if you call Matcher.find() or
>Matcher.replaceAll()) always advances to "first character not matched by
>[the previous] match." (ie: let prev = m.end(); if (m.find()) then prev <=
>m.start()).
>
>So ".*" always matches twice on any given String x ... once when it
>matches from 0 to x.length(), and once when it matches the empty string
>starting and ending at x.length().
>
>That's why using "^.*" doesn't have this problem ... "*" is greedy so it
>only matches once at the start of the string and then there can't be any
>more matches. Conversely: ".*$" and ".*\z" will still have this problem,
>because any number of matches can have the same ending offset.
>
>
>-Hoss

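For the record, here is a minimal Java sketch of what Hoss describes, as far as I understand it. The class name EmptyMatchDemo and the helper spans() are made up for illustration; only Pattern, Matcher and String.replaceAll are real API. It lists the start-end offsets of each successive find() on "xyz" and the result of replaceAll for the unanchored and anchored patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class EmptyMatchDemo {
    // Collect "start-end" offsets for every successive find() on the input.
    static List<String> spans(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "xyz";
        // Unanchored, anchored at the start, anchored at the end.
        for (String regex : new String[] {"(.*)", "^(.*)", "(.*)$"}) {
            System.out.println(regex
                + " spans=" + spans(regex, input)
                + " replaceAll=" + input.replaceAll(regex, "Video"));
        }
    }
}
```

With "(.*)" and "(.*)$" I would expect two spans, 0-3 and then the empty match 3-3, hence "VideoVideo"; with "^(.*)" only 0-3, hence "Video".
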
Hmmm, given the chance Perl behaves the same, although attempting to
use /*/ fails. Another lesson learnt!

#! /usr/local/bin/perl
use strict;
use warnings;

my($s)="cat mat rat hat";

# a: plain literal, one match per "at"
my($c)=0;
print " a-match", ++$c, "='$1'\n" while( $s =~ m/(at)/g );

# b: unanchored .* matches the whole string, then the empty string at the end
$c=0;
print " b-match", ++$c, "='$1'\n" while( $s =~ m/(.*)/g );

# c: anchored at the start, so no second (empty) match is possible
$c=0;
print " c-match", ++$c, "='$1'\n" while( $s =~ m/^(.*)/g );

# d: anchored at the end, which still allows the trailing empty match
$c=0;
print " d-match", ++$c, "='$1'\n" while( $s =~ m/(.*)$/g );

-- 
===============================================================
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================