Okay, well for some reason I am not getting escaped html even though the
comment wrapper is escaping html from the comment content before
returning it, so that is one thing that needs to be fixed.
The other problem is that there isn't (and hasn't been) a way to get to
option #3 where html is fully allowed. This is what is used to render
comment content in macros in 3.1 ...
#if($config.commentEscapeHtml)
#set($content = $utils.escapeHTML($comment.content))
#else
#set($content = $utils.transformToHTMLSubset($utils.escapeHTML($
comment.content)))
#end
... and that basically means your only options are #1 (no html) or #2
(partial html), which is not really what we are aiming for i don't think.
Then, even if html escaping is working properly for #1, i don't think
that's the right way of providing that functionality. If you don't want
people putting html in comments then we should be rejecting those
comments, not just leaving the html there and escaping it. The problem
with just escaping the html is that 1) it's ugly and if someone puts
lots of html in their comment and it gets escaped then it makes the
comment almost unreadable and 2) it's dangerous because technically if
at some later point in time html escaping isn't applied anymore then
potentially dangerous html in the comment can become active again. So
it makes far more sense to me that if the admin wants to prevent html in
comments then we should reject those comments with an error message.
So, getting back to my original suggestions again I think we can revise
a couple things.
1. I think we still need a new runtime property for "enable html in
comments?" This is kind of like a built in comment validator which if
enabled will reject comments with html in them.
2. For doing the html subset we can basically leave it how it is now as
a comment plugin and what it will do is the same thing it does now, just
escape all html then unescape the supported tags. So the expected use
case here would be if the user checked the "enable html in comments?"
box to allow for html in their comments, then checked the HTML Subset
plugin which transforms the comment contents so that only certain html
tags are left unescaped.
Does that sound good?
-- Allen
Dave wrote:
On 7/11/07, Allen Gilliland <[EMAIL PROTECTED]> wrote:
Okay, this work is basically complete, but after doing a fair amount of
testing I am a little uncertain if our handling of html in comments is
working how we expected. My understanding from looking at the UI is
that we are supposed to have these options ...
1. HTML disabled and not allowed in comments. Triggered by checking
"escape comment html" on global config page. This is also the property
which toggles the "HTML Syntax: Enabled/Disabled" text on the comment
form.
What this means is that, if you enter an HTML tag it will be escaped
on display and therefore tags you enter will be displayed as text and
not interpreted by the browser as HTML tags.
2. HTML subset available. Only certain html tags work. I'm not sure I
fully understand how this is supposed to be enforced, it looks like the
code tries to escape all html into < > syntax and then unescapes
certain tags.
In this case, we escape on display and then un-escape certain tags
that we deem to be safe.
3. HTML enabled. Basically, anything goes.
Yep, that's how they are supposed to work.
Now, from looking at the code it doesn't look like we are properly
providing these options. From my testing, when HTML is supposed to be
disabled it's not, and I could still put HTML in comments. And to a
larger degree, it looks like the escapeHTML() method isn't really doing
much. This basically means that the HTML subset option wasn't working
either because the comment was never being escaped in the first place,
so that option was just doing nothing.
Works for me. Escaping completely prevents HTML tags from being
interpreted by the browser.
So can someone verify that my assumptions above are correct and that's
the actual functionality we are shooting for?
I think that fixing #1 should be relatively easy and we can just display
a comment error message to users if they put html in a comment when they
aren't supposed to.
If we escape (as we are doing now) that works just fine.
To handle the conditions for #1 and #3 I'd like to
just migrate the old "escape comment html" property to a new one called
"enable html in comments" and if it's enabled then we don't do anything
to check html in comments, if it's disabled then we reject any comments
with html in them and display and error message.
#2 is a bit more tricky since you then need to be parsing the comment
and looking for specific html. At the end of the day this one feels
more like a comment validator than a built-in option to me because
effectively all you are trying to do is check if an incoming comment
uses any html that you are not allowing, which is the purpose of our
comment validators. So I think that one should be rewritten as a
comment validator.
What's the problem with the white-list based HTML subsetting we have now?
- Dave