Re: [PHP-DEV] [PATCH] Changing entity charset handling in ext/standard/html.c

2002-10-18 Thread Moriyoshi Koizumi
resending...


Moriyoshi Koizumi [EMAIL PROTECTED] wrote:

 I've made a revised patch. I don't see any remaining BC problems with
 it. Please review it.
 
 PS: My search for that discussion came up empty. Could you point me to
 where it took place, if it isn't too much trouble? I have looked through
 the archives of php-dev, php-cvs and php-i18n.
 
 Moriyoshi Koizumi
 
 
 Wez Furlong [EMAIL PROTECTED] wrote:
 
  On 10/17/02, Moriyoshi Koizumi [EMAIL PROTECTED] wrote:
   Yep, as far as I read the archives, I haven't found any discussions on the 
   charset-related backward-compatibility problems. So I wrote *exactly* 
   about this issue.
  
  Search for "htmlentities charset".  Both myself and thies (and probably
  others) were discussing this.
  In short: there are many, many, many people who have scripts that rely
  on htmlentities defaulting to iso-8859-1 (the documented default forever).
  
   I'm going to read the archives more carefully, though I think even 
   handling the charset in phpinfo() will lead to the same discussion in 
   the future.
  
  This is a separate issue and has nothing to do with changing the behaviour
  of htmlentities().
  
  --Wez.
  
  
  -- 
  PHP Development Mailing List http://www.php.net/
  To unsubscribe, visit: http://www.php.net/unsub.php
  
 
 
 

Index: html.c
===================================================================
RCS file: /repository/php4/ext/standard/html.c,v
retrieving revision 1.54
diff -u -r1.54 html.c
--- html.c  3 Oct 2002 12:06:52 -   1.54
+++ html.c  17 Oct 2002 17:58:19 -
@@ -31,6 +31,11 @@
 #include <langinfo.h>
 #endif
 
+#if HAVE_MBSTRING
+# include "ext/mbstring/mbstring.h"
+ZEND_EXTERN_MODULE_GLOBALS(mbstring)
+#endif
+
 enum entity_charset { cs_terminator, cs_8859_1, cs_cp1252,
  cs_8859_15, cs_utf_8, cs_big5, cs_gb2312, 
  cs_big5hkscs, cs_sjis, cs_eucjp};
@@ -525,6 +530,36 @@
 		return cs_8859_1;
 
 	if (strlen(charset_hint) == 0)	{
+#if HAVE_MBSTRING
+		/* XXX: Ugly things. Why don't we look for a more sophisticated way? */
+		switch (MBSTRG(internal_encoding)) {
+			case mbfl_no_encoding_utf8:
+				return cs_utf_8;
+
+			case mbfl_no_encoding_euc_jp:
+			case mbfl_no_encoding_eucjp_win:
+				return cs_eucjp;
+
+			case mbfl_no_encoding_sjis:
+			case mbfl_no_encoding_sjis_win:
+			case mbfl_no_encoding_sjis_mac:
+				return cs_sjis;
+
+			case mbfl_no_encoding_cp1252:
+				return cs_cp1252;
+
+			case mbfl_no_encoding_8859_15:
+				return cs_8859_15;
+
+			case mbfl_no_encoding_big5:
+				return cs_big5;
+
+			case mbfl_no_encoding_euc_cn:
+			case mbfl_no_encoding_hz:
+			case mbfl_no_encoding_cp936:
+				return cs_gb2312;
+		}
+#endif
 		/* try to detect the charset for the locale */
 #if HAVE_NL_LANGINFO && HAVE_LOCALE_H && defined(CODESET)
 		charset_hint = nl_langinfo(CODESET);
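
On the XXX comment in the patch: one possible "more sophisticated way" is a
lookup table in place of the switch. What follows is only a sketch under
stated assumptions, not part of the patch. The demo_* names and the
demo_mbfl_encoding enum are hypothetical stand-ins for the libmbfl
identifiers the patch uses, the enum values are illustrative, and returning
cs_terminator to mean "no mapping, fall through to locale detection" is a
choice made for this example only.

```c
#include <stddef.h>

/* Same enumerators as in html.c, repeated so the sketch compiles alone. */
enum entity_charset { cs_terminator, cs_8859_1, cs_cp1252,
                      cs_8859_15, cs_utf_8, cs_big5, cs_gb2312,
                      cs_big5hkscs, cs_sjis, cs_eucjp };

/* Hypothetical stand-in for libmbfl's encoding ids (NOT the real header). */
enum demo_mbfl_encoding {
    demo_enc_pass,      /* an encoding with no entity table */
    demo_enc_utf8,
    demo_enc_euc_jp, demo_enc_eucjp_win,
    demo_enc_sjis, demo_enc_sjis_win, demo_enc_sjis_mac,
    demo_enc_cp1252,
    demo_enc_8859_15,
    demo_enc_big5,
    demo_enc_euc_cn, demo_enc_hz, demo_enc_cp936
};

struct demo_enc_map {
    enum demo_mbfl_encoding from;
    enum entity_charset to;
};

/* One row per case label of the switch in the patch. */
static const struct demo_enc_map demo_enc_table[] = {
    { demo_enc_utf8,      cs_utf_8   },
    { demo_enc_euc_jp,    cs_eucjp   },
    { demo_enc_eucjp_win, cs_eucjp   },
    { demo_enc_sjis,      cs_sjis    },
    { demo_enc_sjis_win,  cs_sjis    },
    { demo_enc_sjis_mac,  cs_sjis    },
    { demo_enc_cp1252,    cs_cp1252  },
    { demo_enc_8859_15,   cs_8859_15 },
    { demo_enc_big5,      cs_big5    },
    { demo_enc_euc_cn,    cs_gb2312  },
    { demo_enc_hz,        cs_gb2312  },
    { demo_enc_cp936,     cs_gb2312  }
};

/*
 * Linear scan of the table. Returns cs_terminator when the encoding has no
 * row, so a caller can continue with the nl_langinfo() path.
 */
static enum entity_charset demo_charset_from_mbstring(enum demo_mbfl_encoding enc)
{
    size_t i;
    size_t n = sizeof(demo_enc_table) / sizeof(demo_enc_table[0]);

    for (i = 0; i < n; i++) {
        if (demo_enc_table[i].from == enc) {
            return demo_enc_table[i].to;
        }
    }
    return cs_terminator;
}
```

A caller would test the result against cs_terminator and only return early
on a real match. Behaviour for unmapped encodings stays the same as with the
switch: control falls through to the locale detection code.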



Re: [PHP-DEV] [PATCH] Changing entity charset handling in ext/standard/html.c

2002-10-18 Thread Moriyoshi Koizumi
I've made a revised patch. I don't see any remaining BC problems with
it. Please review it.

PS: My search for that discussion came up empty. Could you point me to
where it took place, if it isn't too much trouble? I have looked through
the archives of php-dev, php-cvs and php-i18n.

Moriyoshi Koizumi


Wez Furlong [EMAIL PROTECTED] wrote:

 On 10/17/02, Moriyoshi Koizumi [EMAIL PROTECTED] wrote:
  Yep, as far as I read the archives, I haven't found any discussions on the 
  charset-related backward-compatibility problems. So I wrote *exactly* 
  about this issue.
 
 Search for "htmlentities charset".  Both myself and thies (and probably
 others) were discussing this.
 In short: there are many, many, many people who have scripts that rely
 on htmlentities defaulting to iso-8859-1 (the documented default forever).
 
  I'm going to read the archives more carefully, though I think even 
  handling the charset in phpinfo() will lead to the same discussion in 
  the future.
 
 This is a separate issue and has nothing to do with changing the behaviour
 of htmlentities().
 
 --Wez.
 
 
 

