I wanted a strip_tags() for sanitization in vibe.d, so I went looking for algorithms and came across this JavaScript library, https://github.com/ericnorris/striptags/blob/master/src/striptags.js, which is quite popular judging by the number of stars and forks. As I looked through it, I didn't like the cumbersome approach it used, so I tried to implement it my own way. This is what I lazily did. It turned out to be so simple that I thought I could use some opinions. Notice I didn't add a `tag_replacement` param, but that's just about one line of code.

string stripTags(string input, in string[] allowedTags = [])
{
        import std.regex: Captures, replaceAll, ctRegex;

        auto regex = ctRegex!(`</?(\w*)>`);

        string regexHandler(Captures!(string) match)
        {
            string insertSlash(in string tag)
            in
            {
                assert(tag.length, "Argument must contain one or more characters");
            }
            body
            {
                return tag[0..1] ~ "/" ~ tag[1..$];
            }

            bool allowed = false;
            foreach (tag; allowedTags)
            {
                if (tag == match.hit || insertSlash(tag) == match.hit)
                {
                    allowed = true;
                    break;
                }
            }
            return allowed ? match.hit : "";
        }

        return input.replaceAll!(regexHandler)(regex);
}

unittest
{
        assert(stripTags("<html><b>bold</b></html>") == "bold");
        assert(stripTags("<html><b>bold</b></html>", ["<html>"]) == "<html>bold</html>");
}
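
For what it's worth, here is a rough sketch of how the `tag_replacement` param mentioned above might be bolted on. The names `stripTagsWith` and `tagReplacement` are made up for illustration; the regex and the allow-list format (full tags such as "<b>") are the same as in stripTags above.

```d
import std.regex : Captures, ctRegex, replaceAll;

// Hypothetical variant of stripTags: disallowed tags are swapped for
// tagReplacement instead of always being dropped.
string stripTagsWith(string input, in string[] allowedTags = [],
                     string tagReplacement = "")
{
    auto tagPattern = ctRegex!(`</?(\w*)>`);

    string handler(Captures!(string) match)
    {
        foreach (tag; allowedTags)
        {
            // An allowed "<b>" also allows its closing form "</b>".
            if (tag.length && (tag == match.hit
                    || tag[0 .. 1] ~ "/" ~ tag[1 .. $] == match.hit))
                return match.hit;
        }
        return tagReplacement;
    }

    return input.replaceAll!(handler)(tagPattern);
}
```

With an empty replacement this should behave like stripTags; passing " " keeps words from running together once the tags between them are gone.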



I'm not sure the tag-matching regex I used is the best, though.
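
One thing I noticed is that `</?(\w*)>` only matches bare tags, so something like `<a href="x">` passes through untouched. Below is a sketch of one possible alternative pattern that also swallows attributes. It assumes a different allow-list format (bare names such as "b" rather than "<b>"), and `stripTagsByName` is just an illustrative name. It still isn't a real HTML parser: a `>` inside an attribute value will confuse it, and attributes on allowed tags are kept as they are.

```d
import std.algorithm.searching : canFind;
import std.regex : Captures, ctRegex, replaceAll;

// Hypothetical variant: allowedTags holds bare names such as "b", not "<b>".
string stripTagsByName(string input, in string[] allowedTags = [])
{
    // Capture 1 is the tag name; [^>]* swallows attributes and a trailing "/".
    auto tagPattern = ctRegex!(`</?([A-Za-z][A-Za-z0-9]*)[^>]*>`);

    string handler(Captures!(string) match)
    {
        // Keep the whole tag only if its name is on the allow-list.
        return allowedTags.canFind(match[1]) ? match.hit : "";
    }

    return input.replaceAll!(handler)(tagPattern);
}

unittest
{
    assert(stripTagsByName(`<a href="x">link</a>`) == "link");
    assert(stripTagsByName(`<p class="c"><b>bold</b></p>`, ["b"]) == "<b>bold</b>");
}
```

Matching on the captured name also means the opening and closing forms are handled by the same comparison, so the insertSlash helper isn't needed in this version.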
