Re: comment formatting and escaping is a bit kludgy

Allen Gilliland Wed, 11 Jul 2007 17:16:32 -0700

Okay, well for some reason I am not getting escaped html even though thecomment wrapper is escaping html from the comment content beforereturning it, so that is one thing that needs to be fixed.

The other problem is that there isn't (and hasn't been) a way to get tooption #3 where html is fully allowed. This is what is used to rendercomment content in macros in 3.1 ...


#if($config.commentEscapeHtml)
  #set($content = $utils.escapeHTML($comment.content))
#else
  #set($content = $utils.transformToHTMLSubset($utils.escapeHTML($
comment.content)))
#end

... and that basically means your only options are #1 (no html) or #2(partial html), which is not really what we are aiming for i don't think.

Then, even if html escaping is working properly for #1, i don't thinkthat's the right way of providing that functionality. If you don't wantpeople putting html in comments then we should be rejecting thosecomments, not just leaving the html there and escaping it. The problemwith just escaping the html is that 1) it's ugly and if someone putslots of html in their comment and it gets escaped then it makes thecomment almost unreadable and 2) it's dangerous because technically ifat some later point in time html escaping isn't applied anymore thenpotentially dangerous html in the comment can become active again. Soit makes far more sense to me that if the admin wants to prevent html incomments then we should reject those comments with an error message.

So, getting back to my original suggestions again I think we can revisea couple things.

1. I think we still need a new runtime property for "enable html incomments?" This is kind of like a built in comment validator which ifenabled will reject comments with html in them.

2. For doing the html subset we can basically leave it how it is now asa comment plugin and what it will do is the same thing it does now, justescape all html then unescape the supported tags. So the expected usecase here would be if the user checked the "enable html in comments?"box to allow for html in their comments, then checked the HTML Subsetplugin which transforms the comment contents so that only certain htmltags are left unescaped.


Does that sound good?

-- Allen


Dave wrote:

On 7/11/07, Allen Gilliland <[EMAIL PROTECTED]> wrote:

Okay, this work is basically complete, but after doing a fair amount of
testing I am a little uncertain if our handling of html in comments is
working how we expected.  My understanding from looking at the UI is
that we are supposed to have these options ...

1. HTML disabled and not allowed in comments.  Triggered by checking
"escape comment html" on global config page.  This is also the property

which toggles the "HTML Syntax: Enabled/Disabled" text on the commentform.


What this means is that, if you enter an HTML tag it will be escaped
on display and therefore tags you enter will be displayed as text and
not interpreted by the browser as HTML tags.

2. HTML subset available.  Only certain html tags work.  I'm not sure I
fully understand how this is supposed to be enforced, it looks like the
code tries to escape all html into &lt; &gt; syntax and then unescapes
certain tags.


In this case, we escape on display and then un-escape certain tags
that we deem to be safe.

3. HTML enabled.  Basically, anything goes.



Yep, that's how they are supposed to work.

Now, from looking at the code it doesn't look like we are properly
providing these options.  From my testing, when HTML is supposed to be
disabled it's not, and I could still put HTML in comments.  And to a
larger degree, it looks like the escapeHTML() method isn't really doing
much.  This basically means that the HTML subset option wasn't working
either because the comment was never being escaped in the first place,
so that option was just doing nothing.


Works for me. Escaping completely prevents HTML tags from being
interpreted by the browser.

So can someone verify that my assumptions above are correct and that's
the actual functionality we are shooting for?

I think that fixing #1 should be relatively easy and we can just display
a comment error message to users if they put html in a comment when they
aren't supposed to.


If we escape (as we are doing now) that works just fine.


To handle the conditions for #1 and #3 I'd like to

just migrate the old "escape comment html" property to a new one called
"enable html in comments" and if it's enabled then we don't do anything
to check html in comments, if it's disabled then we reject any comments
with html in them and display and error message.

#2 is a bit more tricky since you then need to be parsing the comment
and looking for specific html.  At the end of the day this one feels
more like a comment validator than a built-in option to me because
effectively all you are trying to do is check if an incoming comment
uses any html that you are not allowing, which is the purpose of our
comment validators.  So I think that one should be rewritten as a
comment validator.


What's the problem with the white-list based HTML subsetting we have now?

- Dave

Re: comment formatting and escaping is a bit kludgy

Reply via email to