From:             mbjr at mbjr dot hu
Operating system: Linux
PHP version:      5.1.2
PHP Bug Type:     Feature/Change Request
Bug description:  UTF-8 DeAccentizer

Description:
------------
Although UTF-8 is becoming widely supported, many people in the affected
countries still enter search strings without any accents or special
characters, as they got used to the old system. At the moment the only way to
produce an accent-free string is a manual strtr() call that lists every such
character.

Reproduce code:
---------------
n/a

Expected result:
----------------
Árvíztűrő tükörfúrógép -> Arvizturo tukorfurogep

All of the following should be converted to "o":
Ò = capital letter o with grave
Ó = capital letter o with acute
Ô = capital letter o with circumflex
Õ = capital letter o with tilde
Ö = capital letter o with diaeresis
Ō = capital letter o with macron
Ŏ = capital letter o with breve
Ő = capital letter o with double acute
Ơ = capital letter o with horn
Ǒ = capital letter o with caron
Ǫ = capital letter o with ogonek
Ǭ = capital letter o with ogonek and macron
Ȍ = capital letter o with double grave
Ȏ = capital letter o with inverted breve
Ȫ = capital letter o with diaeresis and macron
Ȭ = capital letter o with tilde and macron
Ȯ = capital letter o with dot above
Ȱ = capital letter o with dot above and macron
Ṍ = capital letter o with tilde and acute
Ṏ = capital letter o with tilde and diaeresis
Ṑ = capital letter o with macron and grave
Ṓ = capital letter o with macron and acute
Ọ = capital letter o with dot below
Ỏ = capital letter o with hook above
Ố = capital letter o with circumflex and acute
Ồ = capital letter o with circumflex and grave
Ổ = capital letter o with circumflex and hook above
Ỗ = capital letter o with circumflex and tilde
Ộ = capital letter o with circumflex and dot below
Ớ = capital letter o with horn and acute
Ờ = capital letter o with horn and grave
Ở = capital letter o with horn and hook above
Ỡ = capital letter o with horn and tilde
Ợ = capital letter o with horn and dot below

The 34 characters above are Latin capital letters, but there are another 34
for their lowercase forms, which means that in the extended Latin script set
we have 68 matches for an "o". The same applies to e, u, i, and a.

Actual result:
--------------
n/a

--
Edit bug report at http://bugs.php.net/?id=36130&edit=1
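
For illustration, the manual strtr() workaround mentioned in the description could look roughly like the sketch below. The function name deaccent_sketch is hypothetical, and the map is deliberately partial: it covers only the letters needed for the sample string, not the 68 "o" variants listed above. It also assumes the source file and the input are both UTF-8.

```php
<?php
// Hypothetical helper illustrating the manual strtr() workaround.
// The map is intentionally incomplete; a full solution would need an entry
// for every accented letter in the extended Latin range.
function deaccent_sketch($text)
{
    $map = array(
        'Á' => 'A', 'á' => 'a',
        'É' => 'E', 'é' => 'e',
        'Í' => 'I', 'í' => 'i',
        'Ó' => 'O', 'ó' => 'o',
        'Ö' => 'O', 'ö' => 'o',
        'Ő' => 'O', 'ő' => 'o',
        'Ú' => 'U', 'ú' => 'u',
        'Ü' => 'U', 'ü' => 'u',
        'Ű' => 'U', 'ű' => 'u',
    );

    // strtr() with an array works on bytes, so each multi-byte UTF-8 key
    // is matched and replaced as a whole sequence.
    return strtr($text, $map);
}

echo deaccent_sketch('Árvíztűrő tükörfúrógép'), "\n";
```

Every additional letter has to be added to the map by hand, which is the maintenance burden this request is about.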