Hey Moriyoshi,

Sorry for my late entry into the debate, but I ran into the htmlentities() default charset problem today. I wonder why you opted to use the mbstring ini setting (thus making this nice feature mbstring-dependent) when we have the "default_charset" ini setting.
It just sounds more logical to me to use SG(default_charset) for the default charset of htmlentities(). Your thoughts?

Edin

----- Original Message -----
From: "Moriyoshi Koizumi" <[EMAIL PROTECTED]>
To: "Wez Furlong" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Thursday, October 17, 2002 7:48 AM
Subject: Re: [PHP-DEV] [PATCH] Changing entity charset handling in ext/standard/html.c

> Yep, as far as I read the archives, I haven't found any discussions on the
> charset related backwards problems. So I wrote "*exactly* about this
> issue".
>
> You may want to redirect me to bug #9392
> (http://bugs.php.net/bug.php?id=9392), but it doesn't seem to help...
>
> In addition, I found determining the internal charset by LC_CTYPE is
> dangerous because setlocale() is not thread-safe in some libc
> implementations (glibc seems to be that one).
>
> I'm going to read archives more carefully, though I think even handling
> the charset in phpinfo() will yield the same discussion in the future.
>
> Moriyoshi Koizumi
>
> "Wez Furlong" <[EMAIL PROTECTED]> wrote:
>
> > Search the archives for the discussion.
> > phpinfo could determine the charset as your patch does at the start,
> > and then pass the info in php_escape_html_entities.
> >
> > Seems easy to me.
> >
> > --Wez.
> >
> > On 10/16/02, "Moriyoshi Koizumi" <[EMAIL PROTECTED]> wrote:
> > > Wez Furlong <[EMAIL PROTECTED]> wrote:
> > > > Unfortunately, we absolutely must remain 100% backwards compatible with
> > > > htmlentities(), so this patch should not be applied.
> > >
> > > Were there any discussions exactly about this issue? Though I have to see
> > > some historical reason, however I don't understand why 100% backwards
> > > compatibility is required for htmlentities().
> > > Because the patched htmlentities() acts in the same way with default
> > > configuration, and IMHO defaulting to iso-8859-1 is quite meaningless for
> > > the scripts that use other charsets than it.
> > >
> > > Hmm... otherwise I would like to suggest a mbstring function like
> > > mb_htmlentities(), but it would sound like a reinvention of the same
> > > wheel...
> > >
> > > > However, I don't see a problem with making phpinfo determine the charset
> > > > and passing that on to the internal htmlentities function?
> > >
> > > The problem is that php_info_html_esc() in ext/standard/info.c calls
> > > php_escape_html_entities() with no charset information specified. Without
> > > the patch, every character is treated as ISO-8859-1 even if a fetched
> > > character is actually a mere first byte of a multibyte character.
> > >
> > > Moriyoshi Koizumi

--
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, visit: http://www.php.net/unsub.php