Hi, I am seeing some odd regex behavior. Using the demo applet: http://jakarta.apache.org/oro/demo.html
I try the following pattern:

    <(script|object|applet|style|noscript)[^>]*>[\s\S]*?</\1[^>]*>

or an alternate version (with the single-line flag):

    <(script|object|applet|style|noscript)[^>]*>.*?</\1[^>]*>

With the following test input:

    <td height="35" colspan="2" align="center" class="style1">
    <script type="text/javascript">
    function spawn(fileName,width,height) {
    window.open(fileName,'new','toolbar=0,location=0,directories=0,status=0,menubar=0,scrollbars=0,width='+width+',height='+height+',resizable=0');
    }
    </script>
    <style type="text/css">
    .Copyright { font-size: 10px; font-family: Verdana, Arial; color: #FFF; padding:2px; margin:0px; vertical-align:1px; line-height:11px; }
    .Copyright A { color: #FFF; }
    </style>
    <span class="Copyright">© 2006 <a href="http://www.domain.com/" target="_blank">Vantage Media Corporation</a> - <a href="JavaScript:spawn('http://www.domain.com/privacy.html','770','501');">Privacy Statement</a> - <a href="JavaScript:spawn('http://www.domain.com/feedback/?data=aHR0cDovL2NvbGxlZ2UudXMuY29tL2NlYy9mdXR1cmVkZWdyZWUvZGVzaWduLnBocA','460','520');">Send Us Feedback</a></span>
    </td>
    <td valign="top"> </td>
    </tr>
    </table>

=================================================

And the first pattern matches twice (the second pattern obviously doesn't match in the applet, since the applet doesn't have the single-line flag applied).

But the following code:

    Perl5Compiler s_perlCompiler = new Perl5Compiler();
    m_matcher = new Perl5Matcher();
    m_matcher.setMultiline(false);
    Pattern m_forbiddenTagsWithContentPattern = s_perlCompiler.compile(
        "<(script|object|applet|style|noscript)[^>]*>[\\s\\S]*?</\1[^>]*>",
        Perl5Compiler.CASE_INSENSITIVE_MASK | Perl5Compiler.READ_ONLY_MASK);

    // remove content and tags that include script/applet/object etc
    StringSubstitution substitution1 = new StringSubstitution(SPACE);
    filteredStr = Util.substitute(m_matcher, m_forbiddenTagsWithContentPattern,
        substitution1, text, Util.SUBSTITUTE_ALL);
    // text is set as the above sample text.
The substitution does nothing. I even tried:

    PatternMatcherInput input = new PatternMatcherInput(text);
    while (m_matcher.contains(input, pattern)) {
        System.out.println("In manual strip method - Found match btw:"
            + input.getMatchBeginOffset() + "," + input.getMatchEndOffset() + ":"
            + input.substring(input.getMatchBeginOffset(), input.getMatchEndOffset()));
    }

And the above logs nothing. I tried compiling the pattern with the SINGLE_LINE_MASK but that made no difference.

Any ideas/help would be appreciated.

TIA,
CJ
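
One detail in the compile(...) call above may be worth checking, offered as a guess rather than a confirmed diagnosis: inside a Java string literal, \1 is an octal escape for the character U+0001, so the compiled pattern would contain that control character instead of the backreference \1. The applet takes the pattern as raw text and would not be affected. The sketch below illustrates the difference using java.util.regex as a stand-in for ORO (an assumption, made so the example runs on the standard library alone). The class name BackrefEscapeDemo, the helper countMatches, and the shortened sample text are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackrefEscapeDemo {
    // Counts non-overlapping, case-insensitive matches of regex in input.
    public static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Shortened, hypothetical version of the sample input from the post.
        String text = "<td> <script type=\"text/javascript\"> function f() {} </script> "
                + "<style type=\"text/css\"> .Copyright { color: #FFF; } </style> </td>";

        // As written in the post: "\1" is an octal escape, so the regex engine
        // receives the control character U+0001, not a backreference.
        String octalEscape =
                "<(script|object|applet|style|noscript)[^>]*>[\\s\\S]*?</\1[^>]*>";

        // With the backslash doubled, the engine receives the two characters
        // '\' and '1', which it reads as a backreference to group 1.
        String backreference =
                "<(script|object|applet|style|noscript)[^>]*>[\\s\\S]*?</\\1[^>]*>";

        System.out.println((int) "\1".charAt(0));               // prints 1
        System.out.println(countMatches(octalEscape, text));    // prints 0
        System.out.println(countMatches(backreference, text));  // prints 2
    }
}
```

If this is the cause, writing the backreference as \\1 in the Java source would be the corresponding change to the ORO pattern; that has not been verified against ORO here.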