Edit report at http://bugs.php.net/bug.php?id=51385&edit=1

 ID:               51385
 User updated by:  baudav at gmail dot com
 Reported by:      baudav at gmail dot com
 Summary:          htmlentities next substr with UTF-8
 Status:           Bogus
 Type:             Bug
 Package:          *Unicode Issues
 Operating System: W2k3 IIS6
 PHP Version:      5.3.2 vc9-nts

 New Comment:

Oh, please excuse my incomplete report! I tested with both substr and
mb_substr; the result is the same with mbstring.


Previous Comments:
------------------------------------------------------------------------
[2010-03-25 05:27:58] baudav at gmail dot com

Windows 2003 with IIS6 fastcgi; PHP 5.3.1 or 5.3.2 vc9-nts

------------------------------------------------------------------------
[2010-03-25 05:24:44] ahar...@php.net

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Like most PHP functions, substr() is not multibyte-aware. You may prefer
to use mb_substr() instead.
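
As a minimal sketch of that suggestion, the snippet below reuses the string
from the test script and compares a byte-based cut with a character-based
one. The length 29 and the variable names $byteCut and $charCut are
illustrative assumptions, not part of the original report, and the mbstring
extension is assumed to be loaded. The encoding is passed explicitly because
mb_substr() otherwise falls back to mb_internal_encoding(), which may not be
UTF-8 on a given installation.

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
$str = 'câble TOSLink mâle/mâle (1.5 à 25m)';
$etc = '...';

// substr() counts bytes, so a cut at byte 33 lands inside the two-byte "à".
$byteCut = substr($str, 0, 33);

// mb_substr() counts characters once it is told the encoding.
// 29 is an illustrative length: it keeps everything up to the space before "à".
$charCut = mb_substr($str, 0, 29, 'UTF-8');

// Report whether each cut is still well-formed UTF-8.
var_dump(mb_check_encoding($byteCut, 'UTF-8'));
var_dump(mb_check_encoding($charCut, 'UTF-8'));

echo htmlentities($charCut . $etc, ENT_QUOTES, 'UTF-8'), "\n";
```

A character-based length has to be chosen in characters, not bytes, so the
value 33 from the test script would keep more text than the byte-based call.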

------------------------------------------------------------------------
[2010-03-25 05:18:11] baudav at gmail dot com

Description:
------------
substr() does not truncate UTF-8 strings correctly and produces an
invalid UTF-8 string.

The test script must be saved in UTF-8.

Test script:
---------------
<?php
$str = 'câble TOSLink mâle/mâle (1.5 à 25m)';
$etc = '...';

echo htmlentities(substr($str, 0, 33). $etc, ENT_QUOTES, 'UTF-8')

?>

Expected result:
----------------
câble TOSLink mâle/mâle (1.5 ...

Actual result:
--------------
No output; only a PHP warning is logged:



PHP Warning:  htmlentities(): Invalid multibyte sequence in argument in
C:\DATA\WWW\test.php on line 5



Changing substr($str, 0, 33) to substr($str, 0, 32) makes it work.
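
The difference between 32 and 33 can be checked by counting bytes. The
following is a rough sketch, assuming the mbstring extension is available;
the variable names are hypothetical. mb_strcut() is shown only as one
possible byte-oriented alternative, not as the fix suggested in this thread.

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
$str = 'câble TOSLink mâle/mâle (1.5 à 25m)';

// strlen() counts bytes, mb_strlen() counts characters.
echo strlen($str), ' bytes, ', mb_strlen($str, 'UTF-8'), " characters\n";

// Zero-based byte offset at which the two-byte "à" begins.
$pos = strpos($str, 'à');
echo 'The "à" starts at byte offset ', $pos, "\n";

// A 32-byte cut ends just before "à"; a 33-byte cut keeps only its first byte.
foreach (array(32, 33) as $len) {
    $valid = mb_check_encoding(substr($str, 0, $len), 'UTF-8');
    echo $len, ' bytes: ', ($valid ? 'valid' : 'invalid'), " UTF-8\n";
}

// mb_strcut() also takes a byte length, but backs off to a character boundary.
$safe = mb_strcut($str, 0, 33, 'UTF-8');
echo strlen($safe), " bytes kept by mb_strcut()\n";
```

Each "â" and the "à" take two bytes in UTF-8, which is why byte offsets and
character offsets drift apart in this string.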


------------------------------------------------------------------------


