I would like to give this subject its own topic. The first part of this mail
repeats what I wrote before. The second part is my attempt to implement html
entities in page names using the latest BW release (v2.2.8), by hacking the
barn script files, as a proof of concept for proposed BW core changes.

I am exploring how to add full support for html entities in page names. Since
BW already accepts UTF-8 characters in page names, it is only logical to
provide the easiest possible mechanism for entering html entities in page
names. Just that. We can already create page names containing the characters
which are otherwise entered in a document's content as html entities, by
inserting the % encoded characters directly. It is not as if they are not
allowed. By adding the necessary html decoding before url encoding we get
a simple means of entering html entities in page names. This requires special
handling of '&': as long as '&' is escaped for url encoding, it is
impossible to enter html entities in page names.
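
To illustrate the decode-before-encode idea described above, here is a minimal
sketch in plain PHP. The function name pageNameToUrl() is hypothetical and not
part of BW; it only uses the standard html_entity_decode() and urlencode()
calls, and assumes the page name arrives as a UTF-8 string.

```php
<?php
// Hypothetical helper (not a BW function): turn a page name that may
// contain html entities into its url-encoded form.
function pageNameToUrl(string $name): string
{
    // First turn entities such as &eacute; into the real UTF-8 character.
    $decoded = html_entity_decode($name, ENT_QUOTES, 'UTF-8');
    // Then percent-encode the result for use in a url.
    return urlencode($decoded);
}

// With decoding first, the entity becomes the encoded character itself.
echo pageNameToUrl('caf&eacute;'), "\n";   // caf%C3%A9

// Without decoding, '&' and ';' are escaped and the entity is lost.
echo urlencode('caf&eacute;'), "\n";       // caf%26eacute%3B
```

The second call shows why escaping '&' on its own defeats entity input: the
entity is encoded character by character instead of as the letter it names.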

The scope of the characters covered by html entities (and thus correctly
represented by browsers) is quite large and merits attention. See
http://htmlhelp.com/reference/html40/entities/
and the lists of characters available in HTML 4:
1. Latin-1 Entities <http://htmlhelp.com/reference/html40/entities/latin1.html>
2. Entities for Symbols and Greek Letters <http://htmlhelp.com/reference/html40/entities/symbols.html>
3. Special Entities <http://htmlhelp.com/reference/html40/entities/special.html>

It seems strange to open up BW for easy use of thousands of Chinese
characters in page names, and the whole range of Unicode characters from
many languages, yet make it impossible to enter html entities.

Here are the current (v2.2.8) core hacks which make it easy to add html
entities to page names:

1. in engine.php ca. line 61 remove '&' from $BOLTutfEscapeChars array.

2. in engine.php function BOLTredirect() ca. line 1695 add line
        $nextpage = BOLTutf2url($nextpage);

3. in engine.php function BOLTutf2url() ca. line 2121 add line
      $x = html_entity_decode($x, ENT_QUOTES, 'UTF-8');
   before line   $x = urlencode($x);

4. in markups.php ca. line 18 before line MarkUp('pre', 'entities', ...) add
line
      MarkUp('pre', 'ampamp', '/&amp;amp;/', '%26');
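
The pattern in point 4 can be tried in isolation with the sketch below. It
assumes that a markup rule amounts to a regular expression replacement, which
is my reading and not something checked against the BW source; the function
applyAmpAmpRule() is hypothetical and simply wraps PHP's preg_replace().

```php
<?php
// Hypothetical stand-in for the 'ampamp' rule: replace a doubly escaped
// ampersand with its percent-encoded form. Assumes rules behave like
// preg_replace(); this has not been verified against the BW source.
function applyAmpAmpRule(string $text): string
{
    return preg_replace('/&amp;amp;/', '%26', $text);
}

echo applyAmpAmpRule('Tom &amp;amp; Jerry'), "\n";  // Tom %26 Jerry
echo applyAmpAmpRule('Tom &amp; Jerry'), "\n";      // Tom &amp; Jerry (unchanged)
```

Only the doubly escaped form is rewritten; a single &amp; is left for the
later 'entities' rule to handle.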

This is my current test base, and I would be grateful to hear if the above
changes cause any undesired results elsewhere.

In particular it would be interesting to know if the first point causes any
problems.

cheers,
Hans
