Edit report at http://bugs.php.net/bug.php?id=49687&edit=1
 ID:                 49687
 Updated by:         cataphr...@php.net
 Reported by:        sird at rckc dot at
 Summary:            utf8_decode xml_utf8_decode vuln
 Status:             Assigned
 Type:               Bug
 Package:            *Unicode Issues
 Operating System:   *
 PHP Version:        5.2.11
-Assigned To:        scottmac
+Assigned To:        cataphract
 Block user comment: N

 New Comment:

Fixed for PHP 5.3 and trunk.


Previous Comments:
------------------------------------------------------------------------
[2010-10-27 20:13:26] cataphr...@php.net

Automatic comment from SVN on behalf of cataphract
Revision: http://svn.php.net/viewvc/?view=revision&revision=304959
Log: - Fixed bug #49687 (utf8_decode vulnerabilities and deficiencies in the
  number of reported malformed sequences). (Gustavo)
#Made a public interface for get_next_char/utf-8 in trunk to use in utf8_decode.
#In PHP 5.3, trunk's get_next_char was copied to xml.c because 5.3's
#get_next_char is different and is not prepared to recover appropriately from
#errors.

------------------------------------------------------------------------
[2009-10-16 04:53:00] sird at rckc dot at

My last post, I promise. It should say:

    c = ((s[0]&63)<<6) | (s[1]&63);

Greetz!

------------------------------------------------------------------------
[2009-10-16 04:52:21] sird at rckc dot at

Oh, duh! I'm reading the wrong function. :( Sorry.

    if (pos-2 >= 0 || (s[1]&0xC0) != 0x80) {
        c = ((s[0]&7)<<18) | ((s[1]&63)<<12) | ((s[2]&63)<<6) | (s[3]&63);
    } else {
        c = '?';
    }

------------------------------------------------------------------------
[2009-10-16 04:45:25] sird at rckc dot at

Oh, my mistake:

    else if (c < 0x800) {
        newbuf[(*newlen)++] = (0xc0 | (c >> 6));
        newbuf[(*newlen)++] = (0x80 | (c & 0x3f));
    }

should be:

    else if (c < 0x800) {
        if ((s[1]&0xC0) != 0x80) {
            newbuf[(*newlen)++] = '?';
        } else {
            newbuf[(*newlen)++] = (0xc0 | (c >> 6));
            newbuf[(*newlen)++] = (0x80 | (c & 0x3f));
        }
    }

------------------------------------------------------------------------
[2009-10-16 04:41:27] sird at rckc dot at

I disagree. How slow can it be to add two bit operations?

    } else if (c < 0x800) {

change to

    } else if (c < 0x800) {
        if ((s[1]&0xC0) != 0x80) {      // this is a new operation
            newbuf[(*newlen)++] = '?';  // these are not new operations
            pos--;                      // these are not new operations
            s++;                        // these are not new operations
            continue;
        }
    }

Besides, all real implementations do what the spec says they should do.
The point is not to validate that the input is valid Unicode; it is that
Unicode says the algorithm SHOULD do the check. Not doing it in PHP is
just nuts.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    http://bugs.php.net/bug.php?id=49687
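
The check argued for in the comments above (test that the byte after a
two-byte lead is a continuation byte, 10xxxxxx, before decoding it) can be
sketched as follows. This is only an illustration under stated assumptions,
not the code committed in revision 304959: the function name
utf8_to_latin1_sketch is hypothetical, and anything other than ASCII or a
two-byte sequence is simply replaced byte by byte with '?', which is cruder
than what a full decoder would do.

```c
#include <stddef.h>

/* Hypothetical helper, NOT PHP's xml_utf8_decode(): converts UTF-8 to
 * ISO-8859-1 and writes '?' for malformed or unrepresentable input.
 * `out` must have room for len + 1 bytes. Returns the output length. */
static size_t utf8_to_latin1_sketch(const unsigned char *s, size_t len, char *out)
{
    size_t i = 0, n = 0;

    while (i < len) {
        unsigned char b = s[i];

        if (b < 0x80) {
            /* Plain ASCII byte. */
            out[n++] = (char)b;
            i++;
        } else if (b >= 0xC2 && b <= 0xDF) {
            /* Lead byte of a two-byte sequence. The parentheses matter:
             * `x & 0xC0 != 0x80` parses as `x & (0xC0 != 0x80)`, i.e. `x & 1`. */
            if (i + 1 < len && (s[i + 1] & 0xC0) == 0x80) {
                unsigned int c = ((b & 0x1Fu) << 6) | (s[i + 1] & 0x3Fu);
                out[n++] = (c <= 0xFF) ? (char)c : '?';
                i += 2;
            } else {
                /* Missing or invalid continuation byte: emit '?' and consume
                 * only the lead byte, so the next byte is examined on its own. */
                out[n++] = '?';
                i++;
            }
        } else {
            /* Stray continuation byte, overlong lead (0xC0, 0xC1), or a longer
             * sequence. Simplification: one '?' per byte. */
            out[n++] = '?';
            i++;
        }
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, the bytes C3 A9 decode to the single byte E9, while C3
followed by 'A' yields '?' and then 'A', so the byte after a broken lead
byte is not swallowed.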