Add [^>]*> to the end of the regexp so that each match also consumes
the tag's attributes and its closing ">". If you are on CFMX, you can
write a much shorter regexp.
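
A minimal sketch of the suggestion above, using an abbreviated tag list and
a hypothetical sample string (sampleHTML) in place of form.stFormStruct.
The second pattern is only one guess at what a shorter MX regexp could look
like; it assumes the Perl-style negative lookahead that CFMX supports, and
it will still trip over attribute values that contain a ">".

```cfm
<!--- Hypothetical sample input; the original post used form.stFormStruct --->
<cfset sampleHTML = '<table width="100%"><tr><td colspan="2"><input type="text" name="text_1"></td></tr></table>'>

<!--- Tag list shortened for illustration. The trailing [^>]*> eats the
      attributes and the closing angle bracket of every matched tag. --->
<cfset cleanFromHTML = REReplaceNoCase(sampleHTML, "</?(table|tbody|tr|td|th)[^>]*>", "", "all")>

<!--- Assumption: on CFMX, a negative lookahead can instead name the tags
      to keep and remove every other tag. --->
<cfset cleanFromHTMLMX = REReplaceNoCase(sampleHTML, "</?(?!(input|select|textarea|option)\b)[^>]*>", "", "all")>

<!--- For this sample the first pattern leaves only the input tag --->
<cfoutput>#HTMLEditFormat(cleanFromHTML)#</cfoutput>
```

With the full tag list from the original message, the same [^>]*> suffix goes
right after the final closing parenthesis of the pattern.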

> -----Original Message-----
> From: jean-marc bottin [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 3 June 2004 15:45
> To: CF-Talk
> Subject: Regular expression and HTML
>
> I have an RE that I modified to parse some HTML and keep only
> the tags starting with "<input", "<select" or "<textarea",
> but I am struggling with it.
>
> I have some HTML:
>
> <table width="100%"  border="0" cellspacing="0" cellpadding="0">
>   <tr>   
>     <td><input type="file" name="file_1" value="file_1"></td>
>     <td><input type="text" name="text_1" value="text_1"
> size="12" maxlength="18"></td>
>   </tr>
>   <tr>   
>     <td colspan="2"><textarea name="textarea_1" cols="12"
> rows="5" wrap="hard"></textarea></td>   
>   </tr>
>   <tr>
>     <td>
> <input type="radio" name="radio_1" value="1">
> <input type="radio" name="radio_1" value="2">
> <input type="radio" name="radio_1" value="3"></td>
>     <td>
> <select name="select_1">
>   <option value="value11" selected>a</option>
>   <option value="value12">b</option>
>   <option value="value13">c</option>
>   <option value="value14">d</option>
> </select>
> </td>
>   </tr>
>   <tr>
>    <td><input type="submit" name="submit_1" value="submit_1"></td>
> <td><input type="reset" name="reset_1" value="reset_1"></td>
>   </tr>
> </table>
>
>
> After applying the RE I get this:
>
> cellSpacing=0 cellPadding=0 width="100%" border=0>
>
> >
> ><INPUT type=file name=file_1>>
> ><INPUT maxLength=18 size=12 value=text_1 name=text_1>>>
> >
>  colSpan=2><TEXTAREA name=textarea_1 rows=5 wrap=hard
> cols=12></TEXTAREA>>>
> >
> ><INPUT type=radio value=1 name=radio_1> <INPUT type=radio value=2
> >name=radio_1> <INPUT type=radio value=3 name=radio_1>> <SELECT
> >name=select_1> <OPTION value=value11 selected>a</OPTION> <OPTION
> >value=value12>b</OPTION> <OPTION value=value13>c</OPTION> <OPTION
> >value=value14>d</OPTION></SELECT> >>
> >
> ><INPUT type=submit value=submit_1 name=submit_1>> <INPUT type=reset
> >value=reset_1 name=reset_1>>>>
>
>
> I managed to strip out nearly all of the HTML, but the closing
> ">" of each tag and the attributes inside the tags are still
> there. What am I missing? I am not very familiar with REs and
> I am going crazy.
>
> Here is my RE:
>
> <cfset cleanFromHTML = REReplaceNoCase("#form.stFormStruct#",
> "(<\/?)(a(bbr|cronym|ddress|pplet|rea)?|b(ase(font)?|do|ig|lockquote|ody|r|utton)?|c(aption|enter|ite|lass|(o(de|l(group)?)))|d(d|el|fn|i(r|v)|l|t)|em|f(ieldset|o(nt|rm)|rame(set)?)|h([1-6]|ead|r|tml)|i(frame|mg|sindex)|kbd|l(abel|egend|i(nk)?)|m(ap|e(nu|ta))|no(frames|script)|o(bject|l)|p(aram|re)?|q|s(amp|cript|mall|pan|t(r(ike|ong)|yle)|u(b|p))|t(able|body|d|foot|h|itle|r|t)|u(l)?|var)", "", "all")>
>
> Thanks,
>
> Jean-Marc
>
>
>