Edit report at http://bugs.php.net/bug.php?id=49687&edit=1

 ID:                 49687
 Updated by:         cataphr...@php.net
 Reported by:        sird at rckc dot at
 Summary:            utf8_decode xml_utf8_decode vuln
 Status:             Assigned
 Type:               Bug
 Package:            *Unicode Issues
 Operating System:   *
 PHP Version:        5.2.11
-Assigned To:        scottmac
+Assigned To:        cataphract
 Block user comment: N

 New Comment:

Fixed for PHP 5.3 and trunk.


Previous Comments:
------------------------------------------------------------------------
[2010-10-27 20:13:26] cataphr...@php.net

Automatic comment from SVN on behalf of cataphract
Revision: http://svn.php.net/viewvc/?view=revision&revision=304959
Log: - Fixed bug #49687 (utf8_decode vulnerabilities and deficiencies in the number
  of reported malformed sequences). (Gustavo)
#Made a public interface for get_next_char/utf-8 in trunk to use in utf8_decode.
#In PHP 5.3, trunk's get_next_char was copied to xml.c because 5.3's
#get_next_char is different and is not prepared to recover appropriately from
#errors.

------------------------------------------------------------------------
[2009-10-16 04:53:00] sird at rckc dot at

My last post, I promise..

it should say:

        c = ((s[0]&63)<<6) | (s[1]&63);

Greetz!

------------------------------------------------------------------------
[2009-10-16 04:52:21] sird at rckc dot at

Oh, duh! I'm reading the wrong function.. :( Sorry

                        if(pos-2 >= 0 || s[1]&0xC0!=0x80) {
                                c = ((s[0]&7)<<18) | ((s[1]&63)<<12) | ((s[2]&63)<<6) | (s[3]&63);
                        } else {
                                c = '?';
                        }
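
One C detail is worth spelling out for a condition such as
`s[1]&0xC0!=0x80` in the snippet above: `!=` binds tighter than `&`, so the
expression is parsed as `s[1] & (0xC0 != 0x80)`, which is simply `s[1] & 1`.
The sketch below contrasts that reading with the parenthesized form that
tests the top two bits. The function names are hypothetical and exist only
for this illustration; nothing here is taken from the PHP source.

```c
#include <assert.h>

/* How a C compiler parses `b & 0xC0 != 0x80`: != binds tighter than &,
 * so the comparison is evaluated first and the result is just b & 1. */
static int not_continuation_as_parsed(unsigned char b)
{
    return b & (0xC0 != 0x80);
}

/* What the check presumably means: the top two bits are not 10. */
static int not_continuation_as_intended(unsigned char b)
{
    return (b & 0xC0) != 0x80;
}

/* Small self-check contrasting the two readings. */
static void precedence_demo(void)
{
    /* 'B' (0x42) is not a continuation byte, but its low bit is 0. */
    assert(not_continuation_as_intended(0x42) == 1);
    assert(not_continuation_as_parsed(0x42) == 0);

    /* 0xBF is a continuation byte, but its low bit is 1. */
    assert(not_continuation_as_intended(0xBF) == 0);
    assert(not_continuation_as_parsed(0xBF) == 1);
}
```

With the unparenthesized form the result depends only on the low bit of the
byte, so the check neither accepts all continuation bytes nor rejects all
other bytes.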

------------------------------------------------------------------------
[2009-10-16 04:45:25] sird at rckc dot at

oh, my mistake:

                else if (c < 0x800) {
                        newbuf[(*newlen)++] = (0xc0 | (c >> 6));
                        newbuf[(*newlen)++] = (0x80 | (c & 0x3f));
                }

should be:

                else if (c < 0x800) {
                        if ( (s[1]&0xC0) != 0x80 ){
                            newbuf[(*newlen)++] = '?';
                        }else{
                            newbuf[(*newlen)++] = (0xc0 | (c >> 6));
                            newbuf[(*newlen)++] = (0x80 | (c & 0x3f));
                        }
                }

------------------------------------------------------------------------
[2009-10-16 04:41:27] sird at rckc dot at

I disagree.. how slow can it be to add 2 bit operations..

} else if (c < 0x800) {

change to

} else if (c < 0x800) {
    if ( (s[1]&0xC0) != 0x80 ){  // this is a new operation
        newbuf[(*newlen)++] = '?'; // these are not new operations
        pos--; // these are not new operations
        s++; // these are not new operations
        continue;
    }
}

Besides, considering that all real implementations do what the spec says
they should do (the point is not to validate that the input is valid
Unicode; it is that Unicode says the algorithm SHOULD do the check), not
doing it in PHP is just nuts.
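
The check argued for above (reject a lead byte whose following bytes are not
of the form 10xxxxxx, emit '?', and carry on) can be sketched as a small
standalone UTF-8 to ISO-8859-1 decoder. This is not PHP's xml_utf8_decode()
and not the get_next_char used in the eventual fix; the names
utf8_to_latin1_sketch and is_continuation are hypothetical, and emitting one
'?' per malformed sequence is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* True when b has the form 10xxxxxx. Note the parentheses: in C,
 * == and != bind tighter than &. */
static int is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper: decodes `len` bytes of UTF-8 from `in` into
 * ISO-8859-1 in `out`, which must have room for at least `len` bytes.
 * Malformed sequences and code points above U+00FF become '?'.
 * Returns the number of bytes written. */
size_t utf8_to_latin1_sketch(const unsigned char *in, size_t len,
                             unsigned char *out)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);
    while (i < len) {
        unsigned char c = in[i];
        unsigned long cp;
        size_t need, k;

        if (c < 0x80) {                 /* plain ASCII */
            out[n++] = c;
            i++;
            continue;
        }
        if (c >= 0xC2 && c <= 0xDF) {   /* two-byte lead (0xC0/0xC1 are overlong) */
            need = 1; cp = c & 0x1F;
        } else if (c >= 0xE0 && c <= 0xEF) {
            need = 2; cp = c & 0x0F;
        } else if (c >= 0xF0 && c <= 0xF4) {
            need = 3; cp = c & 0x07;
        } else {                        /* stray continuation or invalid lead */
            out[n++] = '?';
            i++;
            continue;
        }

        /* Every trailing byte must exist and be a continuation byte. */
        for (k = 1; k <= need; k++) {
            if (i + k >= len || !is_continuation(in[i + k]))
                break;
            cp = (cp << 6) | (in[i + k] & 0x3F);
        }
        if (k <= need) {                /* truncated or broken sequence */
            out[n++] = '?';
            i += k;                     /* resume at the offending byte */
            continue;
        }

        /* Only two-byte sequences can land in U+0080..U+00FF. */
        out[n++] = (need == 1 && cp <= 0xFF) ? (unsigned char)cp : '?';
        i += need + 1;
    }
    return n;
}
```

In this sketch the byte that breaks a sequence is not swallowed: an input of
0xC3 followed by 'A' yields '?' and then 'A', so a truncated lead byte cannot
hide the character that follows it.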

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    http://bugs.php.net/bug.php?id=49687

