Hi Matt,

It would be nice if scanner check for "#!" at start of file and skip the whole first line if found. Nothing else.

Of curse it won't save syscall itself.
Now shebang line is checked (in CGI SAPI code) before call to compiler and in case of scanner modification this code might be removed. So with opcode caches, which override the compiler routine, this syscall will be saved.

I'll review your constant notes/patches later today.

Thanks. Dmitry.

Matt Wilmas wrote:
Hi Dmitry,

----- Original Message -----
From: "Dmitry Stogov"
Sent: Sunday, July 27, 2008

Hi Matt,

At first as you are a scanner expert, I would like you to look into
another optimization idea.

Probably for historical reason PHP supports shebang lines
(#! /usr/bin/php) on top of php files. Especially to handle them PHP
(CGI/FastCGI/CLI) opens file and check for it. So even with opcode
caches FastCGI PHP does open syscall for the requested script, however
with opcode caches it's absolutely useless.

In case PHP scanner will handle shebang lines itself, we will able to
save this syscall.

Sorry, but now I'm the one who's confused here, since I have no idea what
I'm supposed to look into exactly. :-/  I know about shebang lines, and I
know there's a check in the scanner now to skip over it (must have been
somewhere else with flex...).  PHP itself doesn't "use" the shebang, does
it?  I don't have much knowledge of *nix system stuff, but I thought the
shebang is for the OS to use when asked to run an executable script...

So I don't understand what could be changed in the scanner to save the open
syscall -- since if the scanner is called, the file has already been opened,
right?  Again, sorry, but I'll need more explanation. :-)

I never had time and enough flex/re2c knowledge to implement this idea
myself. May be you'll able to look into the problem. In case you find a
simple solution we will able to do it in php-5.3.

Most PHP hosters and large sites use FastCGI with opcode caches (it is
also the primary way for MS Windows users), so this optimization is
really important.

[see below]

Yes, more reply below...

Matt Wilmas wrote:
Hi Dmitry,

I saw that you commited this patch, with the addition of only replacing
persistent constants (just mentioning for others on the list).  The
attached
patches have a few tweaks:

The main thing I noticed is that if "something" creates a persistent,
case-INsensitive constant (any of those in "standard" PHP, besides
TRUE/FALSE/NULL?), it could still wrongly be substituted.  My change
eliminates that possibility.
After yesterday's subbotnik I'm so stupid so cannot understand simple
tings. :)
Could you point me into the exact part of the patch.

Sure, I'll explain more.  If something (extension, etc...) creates a
persistent constant "foo" and it does NOT have the CONST_CS flag, and
capital "FOO" or whatever is used in a script (and a case-sensitive version
doesn't exist, of course), it will be lowercased, as you know, and "foo"
will be found, but that can't be used since a case-sensitive version may be
defined later (TRUE/FALSE/NULL are exceptions, but OK to do since they have
CONST_CT_SUBST).  Current zend_get_ct_const() behavior after finding
case-insensitive "foo":

        if ((c->flags & CONST_CS) && ...) {
            efree(lookup_name);
            return NULL;  /* Doesn't fail for "foo" since CONST_CS isn't
set...  */
.....
  if (c->flags & CONST_CT_SUBST) {
  return c;  /* This doesn't happen since CONST_CT_SUBST isn't set... */
 }
 if ((c->flags & CONST_PERSISTENT) &&
     ...) {
  return c;  /* This DOES happen since CONST_PERSISTENT is set */
 }

The last part is the problem, of course.  Substitution should only be done
for ones that have:

CONST_CT_SUBST
  or
CONST_PERSISTENT and whose case matches that used in the script (CONST_CS or
lowercase is used, e.g. "foo" in script is OK, not "FOO" or "Foo")

(BTW, that also implies that CT_SUBST is only useful for case-insensitive
constants (like T/F/N) and if something sets it with (CS | PERSISTENT), it
doesn't really do anything -- except not allow it to be used in "const ..."
declarations. :-P  Though that could be desired, by something, and/or to
also have substitution even in a namespace.)

With my change, after finding case-insensitive "foo":

        if ((c->flags & CONST_CT_SUBST) && !(c->flags & CONST_CS)) {
            efree(lookup_name);
            return c;  /* Happens for TRUE/FALSE/NULL */
        }
    }
    efree(lookup_name);
    return NULL;  /* Happens for "foo" before persistent check */

Hope that wasn't TOO verbose. :-O  I just swapped the logic (succeed vs
fail) of that top part so the else {  } block could be removed.

Checking Z_TYPE(c->value) != IS_CONSTANT[_ARRAY] isn't needed with the
persistent check now, is it?
I think you are right, but it's better to have this checks, because
nobody prohibit to create array constants in extensions.

OK. :-)

From orginal patch (zend_constants.c part), the memcmp(...) != 0 isn't
needed as it will always be true.  If it wasn't, the *first* hash lookup
wouldn't have failed. :-)  Like I said in the original message, it's old
code from a long time ago, but was never removed...
I'll check it with more clear head.

I looked through the CVS logs/versions for zend_constants.c again, and the
memcmp() was needed before v1.40 (6 years ago).  That's when all constants
were lowercased first, and then case was checked.  1.40 added case-sensitive
lookup first: memcmp() no longer necessary, as I said.  And this old code
was then copied to zend_get_ct_const(), etc.  That code area shouldn't get
hit very often or anything, but might as well remove since it's redundant!
:-)

Thanks. Dmitry.

- Matt


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to