Req #38138 [Com]: russian encoding detection support
ID: 38138
Comment by: rustamabd at gmail dot com
Reported by: techto...@php.net
Summary: russian encoding detection support
Status: Open
Type: Feature/Change Request
Package: Feature/Change Request
PHP Version: 4.4.2
Block user comment: N
Private report: N

New Comment:

Windows-1251, KOI8-R and CP866 are all single-byte CHARSETs, not ENCODINGs. mb_detect_encoding() is not intended to distinguish between charsets, especially single-byte charsets. Its primary purpose is to detect which multibyte encoding is in use, e.g. UTF-8, UTF-16, Shift-JIS, etc.

Previous Comments:

[2009-08-29 01:03:55] techto...@php.net

It is not freeware - it's open source. That basically means that you have all the tools to do it yourself and contribute back for others. Or learn some other language.

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at http://bugs.php.net/bug.php?id=38138
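The point about single-byte charsets can be illustrated with a minimal sketch. It is not taken from the report: the helper name `decode_as_each` is hypothetical, and the sketch assumes that the mbstring extension is loaded and that the source file is saved as UTF-8. It shows that the same bytes decode without error under each of the three charsets, so a check based only on byte validity has nothing to tell them apart.

```php
<?php
// Hypothetical helper (not part of mbstring): decode the same raw bytes
// under several single-byte charsets and return the UTF-8 results.
function decode_as_each(string $bytes, array $charsets): array
{
    $decoded = [];
    foreach ($charsets as $charset) {
        // mb_convert_encoding(string, to_encoding, from_encoding)
        $decoded[$charset] = mb_convert_encoding($bytes, 'UTF-8', $charset);
    }
    return $decoded;
}

// Produce Windows-1251 bytes from a UTF-8 literal (assumes a UTF-8 source file).
$bytes = mb_convert_encoding("Привет", 'Windows-1251', 'UTF-8');

// Each charset yields some text; only a human (or a language model of
// Russian) can tell which of the results is meaningful.
foreach (decode_as_each($bytes, ['Windows-1251', 'KOI8-R', 'CP866']) as $charset => $text) {
    echo $charset, ': ', $text, PHP_EOL;
}
```

Multibyte encodings such as UTF-8 differ in this respect: many byte sequences are simply invalid in them, which gives a detector something structural to test.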
#38138 [Com]: russian encoding detection support
ID: 38138
Comment by: jehy at valar dot ru
Reported By: techto...@php.net
Status: Open
Bug Type: Feature/Change Request
PHP Version: 4.4.2

New Comment:

Already three years. Still no changes. That's why I hate freeware.

Previous Comments:

[2006-07-20 07:04:21] techto...@php.net

I would like to, if anybody will explain how to port PHP functions to Unicode for dummies. It would also be nice to see an environment to monitor the changes (?trac) and control requirements. The last one is to help analyze deprecated, inconvenient and obscure API - logical bugs - to provide means to increase usability. Like unifying include_path delimiters on all platforms etc. It is just to save some time and make occasional development (which I am pretty restrained to) effective.
#38138 [Com]: russian encoding detection support
ID: 38138
Comment by: wips at mail dot ru
Reported By: techto...@php.net
Status: Open
Bug Type: Feature/Change Request
PHP Version: 4.4.2

New Comment:

Another version of an encoding detector, which works with UTF-8 too: http://popoff.donetsk.ua/file/text/libs/a.charset.php

Previous Comments:

[2006-07-20 06:27:56] tony2...@php.net

> I evaluated PHP6 for a few days, but it was very far from being complete, unfortunately.

I wonder why.. probably because it's still 12+ months before the release? =) Feel free to help us, though. The documentation is not the only area that needs some help =)
#38138 [Com]: russian encoding detection support
ID: 38138
Comment by: Roman dot Kyrylych at gmail dot com
Reported By: techto...@php.net
Status: Open
Bug Type: Feature/Change Request
PHP Version: 4.4.2

New Comment:

Here's a Russian encoding autodetector that can be used after mb_detect_encoding() has returned false: http://www.opennet.ru/base/dev/charset_autodetect.txt.html

Previous Comments:

[2006-07-19 17:34:30] techto...@php.net

Well, I can't say this is OK for me. At first I thought that a simple configure with --enable-mbstring=all should solve the problem, but it appeared that my host of dream already has this option turned on. So autodetection of the Russian language is just not enabled at the code level, i.e. i18n support via mbstring is somehow crippled. I evaluated PHP6 for a few days, but it was very far from being complete, unfortunately.
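A fallback of the kind described in this comment could look roughly like the sketch below. It is only an illustration, not the algorithm of the linked script: the function name `guess_cyrillic_charset` is hypothetical, and the heuristic (pick the charset whose decoding contains the most lowercase Cyrillic letters) is a deliberately simple assumption that suits ordinary mixed-case Russian prose but can fail on short or all-uppercase input. It assumes PHP 7.1 or later, the mbstring extension, and a source file saved as UTF-8.

```php
<?php
// Hypothetical helper (not part of mbstring): guess which single-byte
// Cyrillic charset a byte string uses by scoring each candidate decoding.
function guess_cyrillic_charset(
    string $bytes,
    array $candidates = ['Windows-1251', 'KOI8-R', 'CP866']
): ?string {
    // Valid UTF-8 (including plain ASCII) needs no single-byte guess.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return 'UTF-8';
    }

    $best = null;
    $bestScore = 0;
    foreach ($candidates as $charset) {
        $utf8 = mb_convert_encoding($bytes, 'UTF-8', $charset);
        // Score = number of lowercase Cyrillic letters in this decoding.
        // A wrong charset tends to yield uppercase letters, box-drawing
        // characters or symbols in their place.
        $score = (int) preg_match_all('/[а-яё]/u', $utf8);
        if ($score > $bestScore) {
            $best = $charset;
            $bestScore = $score;
        }
    }

    // null means no candidate produced any lowercase Cyrillic letters.
    return $best;
}

// Usage: try mb_detect_encoding() first, fall back to the heuristic.
$bytes = mb_convert_encoding("Привет, мир", 'KOI8-R', 'UTF-8');
$encoding = mb_detect_encoding($bytes, ['UTF-8'], true);
if ($encoding === false) {
    $encoding = guess_cyrillic_charset($bytes);
}
var_dump($encoding);
```

Counting lowercase letters works here because the three charsets place lowercase Cyrillic at different byte ranges; a production detector would more likely compare letter or letter-pair frequencies against statistics for Russian text.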
#38138 [Com]: russian encoding detection support
ID: 38138
Comment by: maybeoutput at gmail dot com
Reported By: [EMAIL PROTECTED]
Status: Open
Bug Type: Feature/Change Request
PHP Version: 4.4.2

New Comment:

A two-year-old bug and it still can't detect the encoding?

Previous Comments:

[2006-07-19 09:50:33] [EMAIL PROTECTED]

Reclassified as feature request, where it belongs. techtonik, I'm sure you know the email addresses of the ext/mbstring maintainers and can contact them about it. Although, I don't think this will ever appear in PHP6 (because mbstring itself doesn't make much sense there) and it definitely won't appear in PHP4 (it's time to upgrade, eh?).
#38138 [Com]: russian encoding detection support
ID: 38138
Comment by: dennis at nikolaenko dot ru
Reported By: [EMAIL PROTECTED]
Status: Open
Bug Type: Feature/Change Request
PHP Version: 4.4.2

New Comment:

I also bumped into this, just voting.

Previous Comments:

[2006-07-19 09:41:58] [EMAIL PROTECTED]

Description:
------------
Detection of Russian encodings in mb_detect_encoding() is disabled, although they are present in the list of supported encodings. There are just three rather simple encodings (windows-1251, cp866 and koi8-r) that spoil the everyday routines of Russian programmers and make PHP less attractive for millions of potential PHP developers. I'll be grateful if somebody takes care of them by providing a default option for hosting providers, who are not too enthusiastic about experimenting with server-wide configuration.

Reproduce code:
---------------
<?php
// Russian text stored as Windows-1251 bytes (rendered here as Latin-1).
$str = "Íà÷àòà ðàáîòà íàä ðàçðàáîòêîé ñòðàíè÷êè ïðîåêòà. Êîä ñàéòà ñîäåðæèòñÿ â ìîäóëå farplugins íà CVS. Ñâåäåíèÿ îá îøèáêàõ è ïðåäëîæåíèÿ ìîæíî îñòàâëÿòü â òðýêåðàõ â êàòåãîðèè project website èëè â ñïèñêå ðàññûëêè farplugins-devel.";

// The candidate list may be given as a comma-separated string or as an array.
// $encoding = mb_detect_encoding($str, "UTF-8, Windows-1251, CP866, KOI8-R");
$encoding = mb_detect_encoding($str, array("UTF-8", "Windows-1251", "CP866", "KOI8-R"));
var_dump($encoding);

Expected result:
----------------
string(12) "Windows-1251"

Actual result:
--------------
bool(false)