The "greediness" of the * quantifier has already been introduced. Let's expand on it. In practice it means that when you parse an HTML or XML file with a regular expression, you have to be very careful if the same tag can occur several times in your document.

Simple example:
The <b> cat</b> under the <b>table</b> is...

If you use:
put replacetext(tText, "<b>.*</b>", "")

This will give you:
The  is...
because * tries to match as many characters as possible, so the match runs from the first <b> all the way to the last </b>.
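
To see exactly what the greedy pattern swallowed, it can help to look at the matched span instead of deleting it. The sketch below uses Python's re module purely as a stand-in for checking the pattern (an assumption: for this simple pattern it follows the same greedy rule as the engine behind replacetext). The variable names are made up for illustration.

```python
import re

text = "The <b> cat</b> under the <b>table</b> is..."

# Greedy: .* first runs to the end of the line, then backs up
# only as far as needed to find a </b>, which is the LAST one.
greedy_match = re.search(r"<b>.*</b>", text).group(0)
print(greedy_match)

# Deleting that single long match leaves only the text outside it.
greedy_result = re.sub(r"<b>.*</b>", "", text)
print(greedy_result)
```

The single match covers both tag pairs and everything between them, which is why "under the" disappears as well.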

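The way to handle this in PHP is to add a "?" after the "*", which tells the "*" to be as ungreedy as possible, that is, to match as few characters as it can. The PHP manual mentions this at the end of its description of the U modifier:
http://uk.php.net/manual/en/reference.pcre.pattern.modifiers.php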

U (PCRE_UNGREEDY)
This modifier inverts the "greediness" of the quantifiers so that they are not greedy by default, but become greedy if followed by "?". It is not compatible with Perl. It can also be set by a (?U) modifier setting within the pattern or by a question mark behind a quantifier (e.g. .*?).

So, let's try:
put replacetext(tText, "<b>.*?</b>","")

He he, this gives the correct result:
The  under the  is...
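
As a quick check of why this works, the sketch below lists what the ungreedy pattern matches before anything is deleted. Python's re module is again only a stand-in (assumed to treat .*? the same way as replacetext does here), and the variable names are hypothetical.

```python
import re

text = "The <b> cat</b> under the <b>table</b> is..."

# Ungreedy: .*? stops at the FIRST </b> it can reach,
# so each <b>...</b> pair is matched on its own.
lazy_matches = re.findall(r"<b>.*?</b>", text)
print(lazy_matches)

# Removing each short match keeps the text between the tags.
lazy_result = re.sub(r"<b>.*?</b>", "", text)
print(lazy_result)
```

Two separate matches come back, one per tag pair, so only the tagged words are removed.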

--------------------------------------------------------------------------------
Marielle Lange (PhD),  Psycholinguist

Homepage http://homepages.lexicall.org/mlange/
Easy access to lexical databases                    http://lexicall.org
Supporting Education Technologists                  http://revolution.lexicall.org


