On Sat, May 5, 2012 at 5:31 AM, C.Koy <can5...@gmail.com> wrote:

>
> I've been experimenting with bare-bones PHP I've built from pristine
> sources so far. Don't you think you should do the same, in dealing with
> such a  bug?
>

My personal system is a BSD derivative; the Turkish locales on these use
Latin rather than Turkish case conversion (and installing a proper Turkish
locale is a mess), so I've been testing on another system. I've been
hesitant to use its resources too heavily for professional reasons. Running
a small PHP script is one thing, but although the time and space required
for a PHP build aren't large on modern systems, I can't justify one since
it's not directly related to site operations.

On Sat, May 5, 2012 at 8:59 AM, Wim Wisselink <w...@powerassist.nl> wrote:

> Try to var_dump the setLocale and see if it return the specified locale or
> just 'false'. If false try the following:
>
> setlocale(LC_ALL, 'tr_TR.UTF-8');
>

I had previously tested the locale by using "strtolower('I')", as it tests
both that the locale exists and that it uses Turkish-language case
conversion. The systems where I tested C.Koy's script passed the
"strtolower" test. It turned out that the Zend optimizer was what prevented
the error. With it not loaded, the example script failed with a "Fatal
error: Call to undefined function IJK()" error message.
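
The same check can be made one level down, in C, which is where the
engine's case conversion ultimately happens. This is only a minimal sketch:
probe_lower_I is a hypothetical helper, and which locale names are
installed varies by system. In a single-byte Turkish locale such as
ISO-8859-9 I'd expect the result to be 0xFD (dotless i) rather than 'i'.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: switch LC_CTYPE to `name` and report what
 * tolower('I') yields there. Returns -1 if the locale is unavailable.
 * The previous LC_CTYPE locale is not restored, to keep the sketch short. */
int probe_lower_I(const char *name)
{
    assert(name != NULL);
    if (setlocale(LC_CTYPE, name) == NULL) {
        return -1;              /* locale not installed on this system */
    }
    return tolower((unsigned char)'I');
}
```

Calling probe_lower_I("C") gives plain 'i'; passing a Turkish locale name
shows whether that system's locale really uses Turkish case conversion.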

Here's a breakdown:

In both PHP 5.2 and 5.3, calling a function before defining it results in a
dynamic call (INIT_FCALL_BY_NAME+DO_FCALL_BY_NAME). Here's the PHP 5.2 dump
of C.Koy's example:

  line     # *  op                           fetch          ext  return  operands
---------------------------------------------------------------------------------
     2     0  >   FETCH_CONSTANT                                   ~0      'LC_CTYPE'
           1      SEND_VAL                                                 ~0
           2      SEND_VAL                                                 'tr_TR'
           3      DO_FCALL                                      2          'setlocale'
     3     4      INIT_FCALL_BY_NAME                                       'IJK'
           5      DO_FCALL_BY_NAME                              0
     4     6      NOP
     5     7    > RETURN                                                   1
           8*   > ZEND_HANDLE_EXCEPTION

Here's the 5.3 dump:
  line     # *  op                           fetch          ext  return  operands
---------------------------------------------------------------------------------
     2     0  >   EXT_STMT
           1      EXT_FCALL_BEGIN
           2      SEND_VAL                                                 2
           3      SEND_VAL                                                 'tr_TR'
           4      DO_FCALL                                      2          'setlocale'
           5      EXT_FCALL_END
     3     6      EXT_STMT
           7      INIT_FCALL_BY_NAME                                       'ijk', 'IJK'
           8      EXT_FCALL_BEGIN
           9      DO_FCALL_BY_NAME                              0
          10      EXT_FCALL_END
     4    11      EXT_STMT
          12      NOP
     5    13    > RETURN                                                   1

From opcode #7 in the 5.3 dump, we see that 5.3 converts the function name to
lowercase during compilation, but 5.2 doesn't. Examining the source
confirms this: you can see the lowercase conversion in 5.3's
zend_do_begin_dynamic_function_call on lines 1659 (for namespaced calls)
and 1683 (for non-namespaced calls) of zend_compile.c (
http://svn.php.net/viewvc/php/php-src/branches/PHP_5_3_10/Zend/zend_compile.c?revision=323023&view=markup#l1683),
while there's no such conversion in the same function in 5.2 (
http://svn.php.net/viewvc/php/php-src/branches/PHP_5_2/Zend/zend_compile.c?view=markup&pathrev=302150#l1450
).

5.3 only performs case conversion if the function name is a CONST
expression, which is why defining the function after calling it works but
calling a function with a variable name breaks. Correspondingly,
ZEND_INIT_FCALL_BY_NAME_SPEC_CONST_HANDLER (in zend_vm_execute.h) uses the
first operand (which is already lowercased), while the other
INIT_FCALL_BY_NAME opcode handlers (ZEND_INIT_FCALL_BY_NAME_SPEC_*_HANDLER)
use the second, non-lowercased operand.

The 5.2 INIT_FCALL_BY_NAME opcode handlers only ever use the second,
un-lowercased operand.
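
To make the difference concrete, here is a toy C model of the two paths. It
is not the engine's code: lower_c and lower_tr are hypothetical stand-ins
for tolower() under the C and Turkish locales (mapping 'I' to 0xFD, the
dotless i in ISO-8859-9, is an assumption), and table_has stands in for a
function table whose key was lowercased at compile time, before setlocale()
ran.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for tolower() under two locales (assumed mappings). */
typedef unsigned char (*lower_fn)(unsigned char);

unsigned char lower_c(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

unsigned char lower_tr(unsigned char c)
{
    return (c == 'I') ? (unsigned char)0xFD : lower_c(c);
}

/* Lowercase src into dst; dst must hold strlen(src) + 1 bytes. */
void lower_str(char *dst, const char *src, lower_fn lower)
{
    assert(dst != NULL && src != NULL);
    while (*src != '\0') {
        *dst++ = (char)lower((unsigned char)*src++);
    }
    *dst = '\0';
}

/* Toy function table with a single entry, keyed by the name as it was
 * lowercased when the script was compiled. */
int table_has(const char *key)
{
    return strcmp(key, "ijk") == 0;
}

/* Name lowercased at run time, under whatever locale is then active. */
int call_lowered_at_runtime(const char *name, lower_fn current_locale)
{
    char key[64];

    assert(name != NULL && strlen(name) < sizeof key);
    lower_str(key, name, current_locale);
    return table_has(key);
}

/* 5.3 CONST case: the lookup key was already lowercased by the compiler. */
int call_lowered_at_compile_time(const char *precomputed_key)
{
    return table_has(precomputed_key);
}
```

With these, call_lowered_at_runtime("IJK", lower_tr) misses the entry,
while call_lowered_at_compile_time("ijk") finds it whatever the active
locale is.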

So, what does this mean for fixing the bug? Not so much when the function
or class is stored in a variable, since these can't be converted to
lowercase at compile time without converting all variables, which is too
wasteful of both time and space (as both the unconverted and converted
strings would need to be stored). For object instantiation,
zend_do_begin_new_object gets the class name ultimately from the
namespace_name rule. zend_do_begin_new_object could then take the resulting
znode and create a second, lowercased copy, storing it as the second
operand. ZEND_NEW_SPEC_HANDLER would then be altered to use the second
operand (if not UNUSED) to instantiate the object. This certainly seems a
valid alternative to a lowercasing version of the namespace_name rule; it's
not as far-reaching, which may be good (in that it has less impact) and bad
(in that there may be other instances of this bug that it won't fix).
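
A rough C sketch of the dual-operand idea, with plain strings in place of
znodes. The struct and function names are hypothetical, and the ASCII-only
lowercasing is my assumption rather than what zend_do_begin_new_object
would necessarily use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two operands of a NEW opcode. */
typedef struct {
    char *name;     /* class name as written, e.g. for error messages */
    char *lc_name;  /* copy lowercased at compile time, used for lookup */
} new_operands;

/* Build both operands from the name the parser produced.
 * Returns 0 on success, -1 on allocation failure. */
int make_new_operands(new_operands *ops, const char *name)
{
    size_t len, i;

    assert(ops != NULL && name != NULL);
    len = strlen(name);
    ops->name = malloc(len + 1);
    ops->lc_name = malloc(len + 1);
    if (ops->name == NULL || ops->lc_name == NULL) {
        free(ops->name);
        free(ops->lc_name);
        ops->name = ops->lc_name = NULL;
        return -1;
    }
    memcpy(ops->name, name, len + 1);
    for (i = 0; i < len; i++) {
        char c = name[i];
        /* ASCII-only mapping (an assumption of this sketch), so the
         * result is the same whatever locale is active. */
        ops->lc_name[i] = (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
    }
    ops->lc_name[len] = '\0';
    return 0;
}

/* Release both copies. */
void free_new_operands(new_operands *ops)
{
    assert(ops != NULL);
    free(ops->name);
    free(ops->lc_name);
    ops->name = ops->lc_name = NULL;
}
```

The cost is the one noted above: each name is stored twice, which is
tolerable for names known at compile time but not for every variable.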

However, neither the dual-operand solution nor lc_namespace_name will fix
the bug when the identifier is stored in a variable. That requires fixing
the run-time portion of PHP, in particular zend_fetch_class (or
zend_do_begin_class_member_function_call, zend_do_begin_new_object and
likely others) and the INIT_FCALL_BY_NAME handlers.
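
One possible shape for the run-time side, offered as an assumption rather
than as what the engine does or should do: compare identifiers with
ASCII-only case folding so the active locale never enters into it. Both
function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fold A-Z to a-z and leave every other byte untouched. */
static unsigned char ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Returns 1 if a and b are equal ignoring ASCII case, 0 otherwise.
 * Bytes outside A-Z are compared exactly. */
int ascii_names_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (ascii_lower((unsigned char)*a) != ascii_lower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}
```

The trade-off is that identifiers containing non-ASCII letters would be
matched case-sensitively under such a scheme.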

I get the feeling that there are still other cases yet to be discovered
where this bug surfaces.
