Daniel Staal wrote:

...
> You definitely need the s/// operator (unless you can use one of the
> HTML parsing modules). But let's fix that regexp first, shall we?
>
> First off, you may have noticed I removed the first '.*' from your
> regexp: that's because nothing is allowed between the opening '<'
> and the name of the element. Unless, of course, it is a closing tag,
> in which case you have a '/' in there. So, that would be:
>   s/\<\/?font.*\>//i
>
> Just a moment, that's ugly. Substitution allows different delimiters,
> so let's use something else. I'll use '[' and ']'. Rewritten, that
> is:
>   s[\</?font.*\>][]i
> (Note that we've dropped the escape on the slash: it is no longer
> needed.)
>
> Ok, let's try that. Yikes!!! It matches _everything_ after the
> first font tag!! Um, that greedy '.*' needs to be fixed, to stop as
> soon as it can instead of matching as much as it can. We do that by
> adding a '?' after it:
>   s[\</?font.*?\>][]i
>
> There, that's better. Oh, but there is one other problem: '.*?'
> stops at a newline, because '.' does not match a newline by default.
> That may sound fine, but a newline is legal inside an HTML element
> tag... We change this by adding the 's' modifier alongside the 'i':
>   s[\</?font.*?\>][]si
>
> That should work. Of course, it only changes the first font tag it
> finds... To fix that we need another modifier: 'g'. So the final
> pattern is:
>   s[\</?font.*?\>][]gsi
>
> I think that covers everything... And it is a quick lesson in why
> we usually tell people not to try matching HTML with regexps.
>
> Daniel T. Staal
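For anyone following along, here is a minimal sketch of the steps above in a runnable script. The helper name strip_font_tags and both sample strings are made up for illustration; only the two patterns come from the quoted post. It first shows the greedy pattern swallowing the rest of a line, then the final pattern handling mixed case, a tag split across two lines, and more than one tag.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove opening and closing
# <font> tags from a string, leaving the enclosed text in place.
sub strip_font_tags {
    my ($html) = @_;
    # Final pattern from the post:
    #   i   case-insensitive, so <FONT> and <font> both match
    #   s   '.' also matches a newline, so tags may span lines
    #   g   replace every occurrence, not just the first
    # '.*?' is non-greedy, so each match ends at the nearest '>'.
    $html =~ s[\</?font.*?\>][]gsi;
    return $html;
}

# Made-up one-line input for the greedy pattern from the middle of the post.
my $one_line = q{<p><font size="2">Hello</font> <b>world</b></p>};
(my $greedy = $one_line) =~ s[\</?font.*\>][]i;
print "greedy: $greedy\n";              # prints "greedy: <p>"

# Made-up input whose first tag has attributes on two lines.
my $sample = qq{<p><FONT color="red"\n size="2">Hello</FONT> <b>world</b></p>};
print strip_font_tags($sample), "\n";   # prints "<p>Hello <b>world</b></p>"
```

The greedy run leaves only "<p>" because '.*' runs to the last '>' on the line, which is exactly the "matches everything" surprise described above. Note that the pattern would still match a tag such as <fontsize>, since nothing requires a space or '>' right after "font"; that is one more reason to reach for an HTML parsing module when the input is not under your control.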
Cool! Thanks, Daniel, that is very nice work. I could feel myself going back over those first steps in using regexes as I followed your post.

Joseph