Matt Raible wrote:
I've been skimming this thread, but not thoroughly reading each message. One
of the things I'd like to see (that doesn't happen now) is to allow full
HTML - no ifs ands or buts. Meaning that I can add JavaScript or any HTML
tags I'd like to a comment. In addition, it would be nice to add some
comment-post pre-processors that allow replacing \n with <br/> before the
entry gets put in the database.
The first feature is for me. I seem to comment as much on my blog as others
do. It's frustrating for me when my HTML gets escaped. Furthermore, since I
get almost instant notification of comments - I know when to delete those
that have XSS text.


yep, those are both things that we are trying to solve for with this work. i've already introduced the concept of comment plugins into the trunk and those basically work like our entry plugins, they get applied to comments as they are rendered and allow you to transform the text any way you want. we have 2 plugins migrated from old functionality, one which does the autoformatting as you described (text paragraphing to html paragraphing) and one which restricts the content to only a subset of html.

i'm not sure we'll have a way to configure things so that certain users have different restrictions on what they can post in comments though.



The 2nd feature is for readers. A lot of people comment on my blog w/o doing
"preview" first and end up with comments that run together - even though
there's line breaks in their entry. I realize I can change the macro to
render the output different, but that affects all comments - whereas I just
want to affect new ones. I regularly go into my database and format comments
by adding <br/><br/> tags, but it'd be nice not to have to do that.


yep, that's in the autoformatting plugin now. so if you want to do the safe thing you will be able to disable html in comments and then apply the autoformat plugin and it will format things appropriately.

-- Allen



Matt

On 7/12/07, Allen Gilliland <[EMAIL PROTECTED]> wrote:



Dave wrote:
> On 7/11/07, Allen Gilliland <[EMAIL PROTECTED]> wrote:
>> I personally like the idea of rejecting comments with html in them when
>> the admin has disabled html in comments because it seems like the
safest
>> and most consistent thing to do. My feeling is that it's more likely
>> that users will enter in a comment and try to use html than it is that
>> users will be using the < and > characters in a non-html fashion.  So
>> rejecting the comment and telling them not to use those characters will
>> be appropriate and fitting most of the time.
>>
>> There certainly will be some cases where users aren't trying to put in
>> html and their comment gets rejected, but my feeling is that those
cases
>> will be relatively few and with a decent error message they should be
>> able to fix it without much of a problem.
>
> I don't like the idea of rejecting a comment because it has a bracket
> that definitely doesn't seem right. If HTML Is disabled then we should
> treat the comment as plain text. An angle bracket is allowed in plain
> text and is not considered HTML.


I don't like it either, but I'm not sure that will happen all that
often.  What we really need is a better parser so that we could actually
distinguish between real html and casual use of the < and > characters.


>
> I think the most consistent thing to do is this:
>
> If HTML is disabled in comments that means that Roller consider the
> comment content to be of type text/plain. Add a content-type field to
> the comment table so we can save that. And whenever Roller displays
> content of type text/plain in HTML or XML it must be escaped.
>
> If HTML is enabled then consider the content type to be text/html.
> When we display HTML in a comment we always do the HTML
> subsetting/white-listing bit. I don't think we should ever display raw
> comment HTML.


Okay, that sounds like a pretty good plan to me.  However, I'm not sure
I agree about forcing the HTML subsetting/white-listing part.  While I
agree that it is typically dangerous and undesirable, we have to
remember that many times blog applications are used in trusted
environments now, such as corporate intranets, and in those cases it
would be entirely valid to just allow any html in comments.

I guess I also think that it seems silly to not make that an option,
since undoubtably someone will want to do things that way.

So I will add one more column to the roller_comment table for
content-type and tweak the upgrade process so that it gets properly set
based on the previous value of the 'users.comments.escapehtml' property.
  If that property was enabled then the content-type will be text/plain
and if not then it will be text/html.


>
> Going farther down that road, maybe we shouldn't have a "Allow HTML in
> comments" property. Instead, have a "Set content-type to be used for
> new comments" and offer options Text, HTML and XHTML.
>
> This will also make it clear how comments should be treated when in a
> feed. See the Atom <content> elements type attribute here:
> http://tools.ietf.org/html/rfc4287#page-14


I think that's fine, but it's extra effort to do that and I don't have
time for it right now.  I think just sticking with a simple html or
plain text option will be fine for now, and there won't be any issues
with expanding on that down the road and allowing admins to choose their
own content-type if they want.

So for right now I'm just going to leave things as is and provide just a
single "Allow HTML in comments?" checkbox, which is effectively the same
functionality we have right now.  If HTML is allowed then the
content-type for incoming comments will be set to text/html, otherwise
it will be set to text/plain.

Cool?

-- Allen


>
>
> If we don't want to go down that road and add a content-type, I think
> we should stick with the practice of escaping comments that are not
> supposed to be HTML rather than rejecting any angle brackets or HTML
> tags. I don't think you have enough justification for making such a
> behavior change.
>
> - Dave
>
>
>
>
>
> - Dave




Reply via email to