Quoting Arie Folger <[EMAIL PROTECTED]>:

> Hi,
> 
> I modified phpnuke to allow UTF-8, and started filling the site with content
> (although for now the search function has been disabled because I expect it
> not to do Hebrew yet). Then, after viewing a Hebrew article as HTML source, I
> noticed that instead of Unicode characters I got numbered entities. A quick
> look at the MySQL table revealed that everything was stored in numbered
> entities (what a waste of space).

Did your input come from Mozilla? Mozilla does that: it can submit form text as
numbered entities. To make sure, write a CGI script (if you don't trust PHP)
that displays its input as text/plain, and create a form in UTF-8 that posts to
that script.
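
A minimal sketch of such a test script, written in Python rather than PHP. The
names build_response and main are illustrative only, and the script assumes the
web server runs it as a plain CGI program that receives the POST body on
standard input. It echoes the raw request body back untouched, so you can see
exactly what the browser sent.

```python
#!/usr/bin/env python3
"""Echo the raw POST body back as text/plain (hypothetical test script)."""
import os
import sys


def build_response(raw_body: bytes) -> bytes:
    # CGI response: a header block, a blank line, then the body.
    # The body is passed through unchanged so nothing is decoded for us.
    header = b"Content-Type: text/plain; charset=utf-8\r\n\r\n"
    return header + raw_body


def main() -> None:
    # CONTENT_LENGTH is set by the web server for POST requests.
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    raw_body = sys.stdin.buffer.read(length)
    sys.stdout.buffer.write(build_response(raw_body))
    sys.stdout.buffer.flush()


if __name__ == "__main__":
    main()
```

With a form-encoded POST, Hebrew text sent as real UTF-8 should show up as
percent-encoded bytes such as %D7%A9, while numbered entities should show up as
%26%231513%3B (that is, the literal text &#1513;).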

If my hypothesis is correct, what you'll have to do is detect, after each post,
whether the fields contain HTML entities, and convert them to normal characters,
before storing them in the database.
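
As a rough illustration of that conversion step, here is a sketch in Python
(not PHP, and not phpnuke's own code). The function name
decode_numeric_entities is made up for this example. It only touches numbered
entities, decimal or hexadecimal, and by my own choice it leaves ASCII-range
references such as &#60; alone so that escaped markup does not turn into live
markup.

```python
import re

# Matches decimal (&#1513;) and hexadecimal (&#x5E9;) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_entities(text: str) -> str:
    """Replace non-ASCII numbered entities with the characters they name."""

    def _replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave ASCII references alone (assumption: they may be deliberate
        # escapes for characters like "<" or "&").
        if code_point < 0x80:
            return match.group(0)
        # Leave surrogates and out-of-range values alone; they are not
        # valid characters and could not be stored as UTF-8.
        if 0xD800 <= code_point <= 0xDFFF or code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return _NUMERIC_REF.sub(_replace, text)
```

You would call this on each posted field before the INSERT, for example
decode_numeric_entities("&#1513;&#1500;&#1493;&#1501;"), which gives the four
Hebrew letters of "shalom" as real characters.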
 
> Problem is that paragraphs in numbered entities are not entirely displayed as
> rtl, in that the paragraphs are left justified and the bulleted lists are
> backwards, even though the entire section is between <span
> dir="rtl">...</span> tags.

Spans are not the answer, because paragraphs and bulleted lists are block-level
elements, and a span is an inline element. Each block should carry its own
dir="rtl" attribute or an appropriate CSS rule. To the best of my knowledge,
there is no difference between numbered entities and proper characters, because
at least theoretically, all numbered entities are converted to the proper
characters before the rendering is done.

Herouth

