* Louis-David Mitterrand <vindex+lists-markdown-disc...@apartia.org> [2010-05-05 16:05]:
> What would be a "reasonable defaults" whitelist for html tags
> in a forum context?

All the tags Markdown has syntax for:

    em strong a img code br p ul ol li blockquote pre h1 h2 h3 h4 h5 h6

Plus a few very reasonable extras:

    i b cite del ins dl dd dt

Attributes that should be allowed:

    a: href title
    img: src alt title
    ol: start
    blockquote: cite

That’s the minimal reasonable set, I think.

You may or may not want to also whitelist the table-related tags:

    table tr td th tbody tfoot thead caption

Most of their possible attributes should be allowed in that case.
For those, you’ll need to tidy the HTML, not just scrub it, else
people will be able to break your layout in malicious ways.

You ***DON’T*** want to whitelist the `style` attribute under any
circumstances, unless you also have a very very very careful CSS
scrubber, because otherwise it’s possible to inject JavaScript
that way.

You’ll also want to validate `...@href` values to keep people from
putting `javascript:` URIs or similar foolishness in there.

If in doubt, allow too little.

Those are the main considerations out of the way.

Personally I’d also whitelist `small` and `big`, much like `i` and
`b`. You need `i` and `b` because `em` and `strong` are wrong to use
for some well-reasoned formatting that isn’t emphatic (such as
italicising names in citations) – likewise, if you only leave the
header tags for smaller/bigger text, people will abuse them for
setting large or small text that’s not a headline. For similar
reasons, I’d also whitelist `tt`.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
_______________________________________________
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss
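
The tag and attribute lists above, together with the `javascript:` check, could be written down roughly as follows. This is only a sketch in Python under some assumptions: the names `ALLOWED_TAGS`, `ALLOWED_ATTRS`, `SAFE_SCHEMES`, `is_safe_url` and `filter_attrs` are made up for illustration, the list of acceptable URL schemes is a guess (the message only says to reject `javascript:` "or similar"), and attribute values are assumed to have been entity-decoded by whatever HTML parser feeds this code. It does not parse or tidy HTML by itself.

```python
import re

# Tags Markdown has syntax for, plus the "reasonable extras" from the message.
ALLOWED_TAGS = {
    "em", "strong", "a", "img", "code", "br", "p", "ul", "ol", "li",
    "blockquote", "pre", "h1", "h2", "h3", "h4", "h5", "h6",
    "i", "b", "cite", "del", "ins", "dl", "dd", "dt",
}

# Per-tag attribute whitelist; `style` is deliberately absent everywhere.
ALLOWED_ATTRS = {
    "a": {"href", "title"},
    "img": {"src", "alt", "title"},
    "ol": {"start"},
    "blockquote": {"cite"},
}

# Assumption: this scheme list is illustrative, not taken from the message.
SAFE_SCHEMES = {"http", "https", "mailto"}

# Attributes whose values are URLs and therefore need scheme validation.
URL_ATTRS = {"href", "src", "cite"}


def is_safe_url(value):
    """Accept scheme-less (relative) URLs and URLs with a whitelisted scheme."""
    # Drop spaces and control characters first, so that "java\tscript:"
    # is checked as "javascript:".
    cleaned = "".join(ch for ch in value if ch > " ")
    # Only a colon before the first "/", "?" or "#" can end a scheme.
    head = re.split(r"[/?#]", cleaned, maxsplit=1)[0]
    if ":" not in head:
        return True
    scheme = head.split(":", 1)[0].lower()
    return scheme in SAFE_SCHEMES


def filter_attrs(tag, attrs):
    """Return the whitelisted (name, value) pairs, or None to drop the tag."""
    tag = tag.lower()
    if tag not in ALLOWED_TAGS:
        return None
    allowed = ALLOWED_ATTRS.get(tag, set())
    kept = []
    for name, value in attrs:
        name = name.lower()
        if name not in allowed:
            continue  # allow too little rather than too much
        if name in URL_ATTRS and not is_safe_url(value or ""):
            continue
        kept.append((name, value))
    return kept
```

For instance, `filter_attrs("a", [("href", "javascript:alert(1)"), ("title", "x")])` keeps only the `title` pair, and `filter_attrs("script", [])` returns `None`. Whitelisting the table tags, `small`, `big` or `tt` would just mean adding them to `ALLOWED_TAGS`.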