Hi!

Belated happy new year to you all!

I'm working on a news script and needed a way to strip HTML tags from
its input, so I conjured up this little function:

detag: func [
    "Removes HTML tags from a string, file or url; leaves special characters intact, except &nbsp;."
    source [string! file! url!] "String, file or url to detag."
    'target [word!] "Copies the result to this target."
    /custom block [block!] "Block of special characters to be replaced, e.g. [^"Á^" ^"Â^"]."
    /local tag string list
][
    ; Copy the literal block, otherwise /custom pairs pile up between calls.
    list: copy ["<br>" " " "</p>" "^/" "&nbsp;" " "]
    if custom [append list block]
    ; Files and urls are read first; strings are copied so the source stays intact.
    string: either string? source [copy source] [read source]
    ; The list holds search/replace pairs.
    foreach [a b] list [replace/all string a b]
    ; Cut out the first remaining <...> tag until none is left.
    while [parse string [to "<" copy tag thru ">" to end]] [
        remove/part find string tag length? tag
    ]
    set target string
    string
]
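
A short usage sketch, assuming the detag function above is already loaded
in a REBOL 2 console. The words result and text are just example names,
and the "&eacute;" pair is an illustrative /custom entry:

```rebol
; The second argument is a literal word; detag sets it to the cleaned string.
detag "<p>Hello<br>world</p>" result
probe result    ; "Hello world^/"  (<br> became a space, </p> a newline)

; /custom takes extra search/replace pairs.
detag/custom "caf&eacute;<br>" text ["&eacute;" "e"]
probe text      ; "cafe "
```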

Feel free to optimise all this. I bloated it a little to handle <br>,
</p> and &nbsp;, and I added a /custom refinement that you can use to
replace any other special characters. I wanted to add a refinement that
did that automatically, but I was too lazy to figure out a really efficient
way to handle the special characters (the pattern is: &, any letter,
acute/circ/etc., then ;).
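
One possible way to handle that pattern automatically is a parse rule that
matches "&", one letter, an accent name and ";", and keeps only the letter.
This is a minimal sketch for REBOL 2, not part of detag; the name
strip-accents and the list of accent names are my own assumptions, and it
drops the accent rather than producing the accented character:

```rebol
strip-accents: func [
    "Replaces entities such as &eacute; with their base letter (sketch)."
    string [string!]
    /local letter kind start finish base
][
    letter: charset [#"a" - #"z" #"A" - #"Z"]
    kind: ["acute" | "circ" | "grave" | "uml" | "tilde"]
    parse/all/case string [
        any [
            ; Remember where the entity starts and ends, then swap it
            ; for the single letter and continue right after that letter.
            start: "&" copy base letter kind ";" finish:
            (change/part start base finish) :start skip
            | skip
        ]
    ]
    string
]

probe strip-accents "caf&eacute; cr&egrave;me"    ; "cafe creme"
```

Entities that do not fit the pattern, such as &amp; or &nbsp;, are left
untouched, so they would still need the replacement list or /custom.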

Have fun, keep coding!

Regards,
Rachid
