resending...

Moriyoshi Koizumi <[EMAIL PROTECTED]> wrote:

> I've made a revised patch. I don't think there are any BC problems left with
> it. Please review it.
> 
> PS: My search ended without a single hit on such a discussion. Could you
> point me to where that discussion took place, if it isn't too much
> trouble? I've already looked through the archives of php-dev, php-cvs and
> php-i18n.
> 
> Moriyoshi Koizumi
> 
> 
> "Wez Furlong" <[EMAIL PROTECTED]> wrote:
> 
> > On 10/17/02, "Moriyoshi Koizumi" <[EMAIL PROTECTED]> wrote:
> > > Yep, as far as I can tell from the archives, there haven't been any
> > > discussions of the charset-related backward compatibility problems.
> > > That's why I wrote "*exactly* about this issue".
> > 
> > Search for "htmlentities charset".  Both Thies and I (and probably others)
> > were discussing this.
> > In short: there are many, many, many people who have scripts that rely
> > on htmlentities defaulting to iso-8859-1 (the documented default forever).
> > 
> > > I'm going to read the archives more carefully, though I think even
> > > handling the charset in phpinfo() will lead to the same discussion in
> > > the future.
> > 
> > This is a separate issue and has nothing to do with changing the behaviour of
> > htmlentities().
> > 
> > --Wez.
> > 
> > 
> > -- 
> > PHP Development Mailing List <http://www.php.net/>
> > To unsubscribe, visit: http://www.php.net/unsub.php
> > 
> 
> 
Index: html.c
===================================================================
RCS file: /repository/php4/ext/standard/html.c,v
retrieving revision 1.54
diff -u -r1.54 html.c
--- html.c      3 Oct 2002 12:06:52 -0000       1.54
+++ html.c      17 Oct 2002 17:58:19 -0000
@@ -31,6 +31,11 @@
 #include <langinfo.h>
 #endif
 
+#if HAVE_MBSTRING
+# include "ext/mbstring/mbstring.h"
+ZEND_EXTERN_MODULE_GLOBALS(mbstring)
+#endif
+
 enum entity_charset { cs_terminator, cs_8859_1, cs_cp1252,
                                          cs_8859_15, cs_utf_8, cs_big5, cs_gb2312, 
                                          cs_big5hkscs, cs_sjis, cs_eucjp};
@@ -525,6 +530,36 @@
                return cs_8859_1;
 
        if (strlen(charset_hint) == 0)  {
+#if HAVE_MBSTRING
+       /* XXX: Ugly things. Why don't we look for a more sophisticated way? */
+               switch (MBSTRG(internal_encoding)) {
+                       case mbfl_no_encoding_utf8:
+                               return cs_utf_8;
+
+                       case mbfl_no_encoding_euc_jp:
+                       case mbfl_no_encoding_eucjp_win:
+                               return cs_eucjp;
+
+                       case mbfl_no_encoding_sjis:
+                       case mbfl_no_encoding_sjis_win:
+                       case mbfl_no_encoding_sjis_mac:
+                               return cs_sjis;
+
+                       case mbfl_no_encoding_cp1252:
+                               return cs_cp1252;
+
+                       case mbfl_no_encoding_8859_15:
+                               return cs_8859_15;
+
+                       case mbfl_no_encoding_big5:
+                               return cs_big5;
+
+                       case mbfl_no_encoding_euc_cn:
+                       case mbfl_no_encoding_hz:
+                       case mbfl_no_encoding_cp936:
+                               return cs_gb2312;
+               }
+#endif
                /* try to detect the charset for the locale */
 #if HAVE_NL_LANGINFO && HAVE_LOCALE_H && defined(CODESET)
                charset_hint = nl_langinfo(CODESET);
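
On the XXX comment in the hunk above: one possibly tidier alternative to the
switch is a small lookup table. What follows is only a sketch under stated
assumptions, not part of the patch. The demo_mb_* enumerators are hypothetical
stand-ins for the real mbfl_no_encoding_* constants (whose values live in the
mbstring headers), demo_charset_from_mb() is an illustrative name, and
returning cs_terminator to mean "no mapping, fall through to locale detection"
is a choice made here for the example.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the mbfl_no_encoding_* constants used in the
 * patch; the real definitions come from the mbstring headers. */
enum demo_mb_encoding {
    demo_mb_utf8,
    demo_mb_euc_jp, demo_mb_eucjp_win,
    demo_mb_sjis, demo_mb_sjis_win, demo_mb_sjis_mac,
    demo_mb_cp1252, demo_mb_8859_15, demo_mb_big5,
    demo_mb_euc_cn, demo_mb_hz, demo_mb_cp936,
    demo_mb_other /* any encoding the table does not cover */
};

/* Same enumerators, in the same order, as the enum in html.c. */
enum entity_charset { cs_terminator, cs_8859_1, cs_cp1252,
                      cs_8859_15, cs_utf_8, cs_big5, cs_gb2312,
                      cs_big5hkscs, cs_sjis, cs_eucjp };

/* One row per case label of the switch in the patch. */
static const struct {
    enum demo_mb_encoding mb;
    enum entity_charset cs;
} demo_mb_map[] = {
    { demo_mb_utf8,      cs_utf_8   },
    { demo_mb_euc_jp,    cs_eucjp   },
    { demo_mb_eucjp_win, cs_eucjp   },
    { demo_mb_sjis,      cs_sjis    },
    { demo_mb_sjis_win,  cs_sjis    },
    { demo_mb_sjis_mac,  cs_sjis    },
    { demo_mb_cp1252,    cs_cp1252  },
    { demo_mb_8859_15,   cs_8859_15 },
    { demo_mb_big5,      cs_big5    },
    { demo_mb_euc_cn,    cs_gb2312  },
    { demo_mb_hz,        cs_gb2312  },
    { demo_mb_cp936,     cs_gb2312  }
};

/* Linear scan of the table. Returns cs_terminator when the encoding has no
 * row (an assumption of this sketch), so that the caller can carry on with
 * the existing locale-based detection. */
static enum entity_charset demo_charset_from_mb(enum demo_mb_encoding enc)
{
    size_t i;

    for (i = 0; i < sizeof(demo_mb_map) / sizeof(demo_mb_map[0]); i++) {
        if (demo_mb_map[i].mb == enc) {
            return demo_mb_map[i].cs;
        }
    }
    return cs_terminator;
}
```

With a table like this, supporting another encoding would mean adding one row
rather than another case label, and the caller would return the result only
when it is not cs_terminator.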
