On Wed, Mar 11, 2009 at 9:03 AM, Linly <[email protected]> wrote:
>
>> First we need to really think about the best place to encode/decode
>> the UTF chars so that we get the most functionality as simply as possible.
>> Switching the decoding to the final output of the markup table (as
>> Hans suggested) will simplify a lot of code and make sure we never see
>> a %encoded url anywhere (unless escaped--a whole other issue). And
>> while we probably don't want the underlying page content %encoded (for
>> those who snoop around in there), there are some real possibilities if
>> we did: it could make UTF page names work anywhere in the system. For
>> example, there's no reason we couldn't define some custom system var
>> like {二} and get it to work. Or a function like [(二 ....)]. (We might
>> already be able to do this with mapping). Crazy possibilities.
>
> I think if we came to "{{p}::二}", that would be far enough. :)

For now, perhaps. But I'm sure someone is going to want to retrieve
some info from a non-ASCII page. :)  Probably you first, Linly!
Perhaps a report showing all the page summaries or something, with
some pages being UTF?  And being able to do {二} might be really
trivial if we get the encoding/decoding timed right. It's just the
implications that are interesting.
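
To make the timing question concrete, here is a minimal sketch, not
BoltWire code. It assumes the page name travels through the markup
rules %encoded and is decoded only at the final output step. The two
helper names are hypothetical; rawurlencode() and rawurldecode() are
standard PHP.

```php
<?php
// Hypothetical helpers, not part of BoltWire.

// Encode a UTF-8 page name so only ASCII moves through the markup rules.
function encode_page_name(string $name): string {
    return rawurlencode($name);
}

// Decode at the very end, just before the text is sent to the browser.
function decode_for_output(string $text): string {
    return rawurldecode($text);
}

$encoded = encode_page_name('二');      // "%E4%BA%8C"
$markup  = '{' . $encoded . '}';        // what the markup rules would see
echo decode_for_output($markup), "\n";  // prints {二}
```

Decoding in one place only is what keeps the rest of the code from
ever having to care whether a name is ASCII or not.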

I just verified that function mapping already works. For instance, put
this in config.php:

$BOLTtoolmap['f']['追寻'] = 'search';

Then put this in a page :)

[(追寻 group=site)]

Though no one has done much with mapping yet, it is a very powerful
thing. I think we could develop translations where a language file not
only translates the messages, buttons, etc., but also includes some
kind of chinese.php file you simply enable, and all your functions,
commands, and conditions are instantly remapped to Chinese
equivalents. (The default English names still work; the Chinese ones
just get mapped to their English equivalents.)  Cool?
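
A rough sketch of what such a chinese.php might look like. Only the
$BOLTtoolmap['f'] form and the 'search' function come from the example
above. The second alias and the resolve_function() helper are
assumptions for illustration, not how BoltWire actually does the
lookup.

```php
<?php
// Sketch of a hypothetical chinese.php mapping file.
// Only $BOLTtoolmap['f'] and 'search' are taken from the example above;
// the second alias and the resolver are illustrative assumptions.
$BOLTtoolmap['f']['追寻'] = 'search';
$BOLTtoolmap['f']['搜索'] = 'search';

// Hypothetical resolver: mapped names go to their English equivalent,
// unmapped names (including the English defaults) pass through unchanged.
function resolve_function(string $name, array $map): string {
    return $map['f'][$name] ?? $name;
}

echo resolve_function('追寻', $BOLTtoolmap), "\n";   // search
echo resolve_function('search', $BOLTtoolmap), "\n"; // search
```

Because unmapped names fall through untouched, enabling the file
would not break any page still using the English names.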

>> 1) Security. Though I haven't yet tested, I'm concerned someone could
>> url_encode a XSS hack, drop it into any BoltWire comment box, and
>> wreak havoc. It would bypass all filters (I have had to add %'s now to
>> most filters to admit page names), and then if BoltWire blindly
>> decoded everything, it could output perfectly formed javascript to the
>> page. This may already be a vulnerability if you have the new utf
>> pagenames enabled.
>
> I know nothing about this, but why not just put a pair of
> <code></code> tags around the comment input, forcing all of the
> content to become pure text? No code allowed. Many blog scripts use
> this approach to prevent XSS or other things.

That can be done, as long as you don't want to allow markup in the
comments, but some might want to allow certain markup. Actually, that
suggests a need for something like <limit rules=vars,fmt,links>
</limit> for situations like this. Easy to do... Great idea.
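
Roughly what a limit like that could do, as a sketch only: escape
everything, then apply just the rule groups that were named. The rule
table, the ** and [[ ]] syntax, and limit_markup() are all made up for
illustration and are not BoltWire's markup table.

```php
<?php
// Hypothetical rule table: rule-group name => [regex, replacement].
// The markup syntax here is invented for the example.
$rules = [
    'fmt'   => ['/\*\*(.+?)\*\*/', '<strong>$1</strong>'],
    'links' => ['/\[\[([A-Za-z0-9.]+)\]\]/', '<a href="?p=$1">$1</a>'],
];

// Escape everything first, then apply only the allowed rule groups.
function limit_markup(string $text, array $rules, array $allowed): string {
    $out = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    foreach ($allowed as $name) {
        if (isset($rules[$name])) {
            [$pattern, $replacement] = $rules[$name];
            $out = preg_replace($pattern, $replacement, $out);
        }
    }
    return $out;
}

echo limit_markup('**hi** [[main]] <b>x</b>', $rules, ['fmt']), "\n";
// <strong>hi</strong> [[main]] &lt;b&gt;x&lt;/b&gt;
```

With only 'fmt' allowed, the link markup stays as plain text and the
raw HTML is escaped rather than rendered.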

But that's beside the point: we can't count on every user to be wise.
We have to have a way to protect them from themselves. I've just
verified on my own system that there are some definite vulnerabilities
with the UTF pages. Hans has pointed out others off-list.
Unfortunately, both sets occur even with UTF pages turned off. So this
is something we need to be careful about on our end, rather than
putting that responsibility on the user.
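
For anyone who wants to see the shape of the problem, here is a
minimal sketch. The naive_filter_passes() check is a hypothetical
stand-in for whatever input filtering is in place, not BoltWire's
actual filter; rawurldecode() and htmlspecialchars() are standard PHP.

```php
<?php
// Hypothetical filter: passes any text that has no literal "<script".
function naive_filter_passes(string $input): bool {
    return stripos($input, '<script') === false;
}

// Sketch of a safer output step: decode, then escape HTML special
// characters, so decoded text can never become live markup.
function safe_output(string $input): string {
    $decoded = rawurldecode($input);
    return htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');
}

$payload = '%3Cscript%3Ealert(1)%3C/script%3E';

var_dump(naive_filter_passes($payload));               // bool(true): slips through
var_dump(naive_filter_passes(rawurldecode($payload))); // bool(false): caught once decoded
echo safe_output($payload), "\n"; // &lt;script&gt;alert(1)&lt;/script&gt;
```

The point is the ordering: any check that runs before decoding sees
only harmless-looking %XX text, so whatever decodes must also escape.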

Cheers,
Dan
