[PHP-DEV] Updated RFC: Zend Signal Handling

2009-11-20 Thread shire


Lucas and I have re-visited the Zend Signal Handling RFC and have updated the 
patches for both the 5.3 branch and Trunk:

http://wiki.php.net/rfc/zendsignals

We updated the patches to deal with a bug fix that allows timeouts within the 
user space shutdown functions, previously multiple timeouts would not occur.  
To deal with this we reset the signal handlers on a timeout so that the user 
shutdown function can once again get a signal and behave accordingly.

Per the Chicago developer meeting, it was determined that the windows timeouts 
are not asynchronous and thus there isn't currently any special handling 
required for critical sections on the windows builds.

We would like to hear about any other issues/feedback with this patch, and if 
there's no objections get this checked in as it will assist with getting more 
stability for signal handling (specific example is the use of spin-locks with 
APC).

Thanks!,

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] RFC: php_config --extra_includes

2009-10-11 Thread shire


Heya,

Pierre Joye wrote:

hi!

On Sun, Oct 11, 2009 at 10:05 AM, shire  wrote:

Currently we have an open bug #16809
(http://pecl.php.net/bugs/bug.php?id=16809&edit=1) in APC where we're unable
to find the pcre.h header file as it's not in our include path when a user
compiles PHP with the --enable-pcre=.  This is because
the include path the user provides isn't made available in INCLUDES in the
extension (APC) when compiled as a shared library using phpize.

I've come up with the following solution, but want feedback/other options
before I commit.  Essentially I've made use of the EXTRA_INCLUDES that's
included in INCLUDES, but not really used much of anywhere AFAIK.  I've also
added the autoconf macro PHP_ADD_EXTRA_INCLUDE that mirrors the
PHP_ADD_INCLUDE behavior, as well as the --extra_includes option to
php-config.  This way extensions can provide additional include paths that
they depend on for extensions that may in-turn be dependent on their header
files.

http://tekrat.com/downloads/bits/extra_includes.php5.3.patch


Would it be not more correct to test if the bundled pcre is used or
not. And detect the pcre header if the system pcre is used?



We can definitely check to see if PCRE is bundled or not, but how do we detect 
the exact path of the pcre.h file that was used?  (we could also have multiple 
pcre.h files on the same system), so it seems we need to store this somewhere 
in the PHP configuration that the shared extension can pull from.  My thinking 
here was that just making it part of the existing INCLUDE path was easiest, we 
could probably make this some sort of macro that the extension has to 
explicitly call to get this path included as well.  Perhaps that's more ideal 
so we don't get a bunch of extra include paths we don't actually need for other 
extensions?

I'm not sure I follow the openssl/libxml examples entirely, as this case would 
be us depending on having a way for an extension like APC to get the location 
of openssl includes/lib paths.  Do we have that scenario somewhere?

Thanks!

-shire




--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] RFC: php_config --extra_includes

2009-10-11 Thread shire


Currently we have an open bug #16809 
(http://pecl.php.net/bugs/bug.php?id=16809&edit=1) in APC where we're unable to find 
the pcre.h header file as it's not in our include path when a user compiles PHP with the 
--enable-pcre=.  This is because the include path the user 
provides isn't made available in INCLUDES in the extension (APC) when compiled as a 
shared library using phpize.

I've come up with the following solution, but want feedback/other options 
before I commit.  Essentially I've made use of the EXTRA_INCLUDES that's 
included in INCLUDES, but not really used much of anywhere AFAIK.  I've also 
added the autoconf macro PHP_ADD_EXTRA_INCLUDE that mirrors the PHP_ADD_INCLUDE 
behavior, as well as the --extra_includes option to php-config.  This way 
extensions can provide additional include paths that they depend on for 
extensions that may in-turn be dependent on their header files.

http://tekrat.com/downloads/bits/extra_includes.php5.3.patch

Interested in hearing other options on how this can be corrected, or feedback 
on this approach...


Thanks,

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] svn checkout suggestion

2009-07-17 Thread shire
Davey Shafik wrote:  

More importantly, the branch/merge support in SVN is limited to
temporary feature/bug branches. You branch, *complete* the feature/bug
fix, and then merge it in. After that, if you decide to carry on in your
branch, SVN's merge tracking cannot handle the tracking of changes. You
need to merge, delete, re-branch, carry on. This doesn't work for say,
the 5.3.x branch fixes getting pushed into trunk, the 5.3.x branch is
toast as far as SVN is concerned once you merge the first fix — quite
useless.


I didn't realize this was the what was being described in documentation.

"In Subversion 1.5, once a --reintegrate merge is done from branch to trunk, the 
branch is no longer usable for further work. It's not able to correctly absorb new trunk 
changes, nor can it be properly reintegrated to trunk again. For this reason, if you want 
to keep working on your feature branch, we recommend destroying it and then re-creating 
it from the trunk:"

That's pretty limiting, although I don't see why you can't work around this by 
cherry-picking your changes out and only merging those back to trunk.  This of 
course isn't really ideal.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] svn checkout suggestion

2009-07-16 Thread shire

Pierre Joye wrote:

On Fri, Jul 17, 2009 at 12:45 AM, shire  wrote:

Jani Taskinen wrote:

Rasmus Lerdorf wrote:

One of the benefits of svn is that we can do cross-branch commit pretty
easily now and thus avoid multiple similar commits with annoying MFH/MFB
commit log messages that are hard to track.

I did a commit today on all 3 branches and it worked fine except for the
fact that no commit email was sent anywhere. I sent Gwynne a note about
it, but so that everyone knows too, don't commit anything like that
until this is fixed..


Do we have a long-term plan of using actual merge commands/tools to merge
our branches rather than duplicating commits or manually merging?  I think
this could speed up development and allow us to have more control over
releases, versions, etc.  I've seen cases in the past where changes fall
through the cracks because they didn't get manually merged up/down.  The
ability to merge complete branches as a branch rather than many different
commits could save some hastle assuming everyone follows the same
commit/merge patterns.


I doubt any tools can really help to automatically help to merge
changes from one branch to another (in php). However there are many
very good merging tools (with UI) out there to ease this process
(meld, winmerge, etc.).



I don't see why this wouldn't be possible for PHP, but it would likely require 
us changing our current methodology for creating branches.  Currently we make a 
branch and never merge it back, this is a bit different than what's seen in 
some other repositories.  Specifically, we'd need to look at creating more 
branches for specific features and merging those back to a central version or 
trunk depending on the release it's planned to go out with.

One downside to this is that you usually need to have someone making sure things get 
synced back and forth (probably a RM), but this can be useful for keeping a development 
version relatively stable and encouraging cooperation between developers.  Once a release 
goes out this stable code can then be merged back up-stream into the next major release 
or trunk etc.  Obviously there's lots of details in here about how it can be structured 
and work, but I'm mostly asking about the high level "is this the direction we want 
to go".

This would basically mean you could make as many single commits as necessary for whatever 
you're working from without adversely affecting a lot of other branches or having to make 
up multiple commits for each bug.  The downside being that to keep conflicts simple we 
should merge and merge often, and this could require the developer to do the 
"merge" on each commit or the RM doing a series of them and resolving any 
possible commits themselves.  The first would be closer to what we do now, but would be 
simpler and allow merge-tracking.  Would be interesting to know how some other large FOSS 
projects are handling this as well and what those experiences have been like.  We also 
can't really do this for our current branches, it would probably need to be for all 
future versions.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] svn checkout suggestion

2009-07-16 Thread shire

Jeff Griffiths wrote:

On 16/07/09 3:45 PM, shire wrote:
...

Do we have a long-term plan of using actual merge commands/tools to
merge our branches rather than duplicating commits or manually merging?
I think this could speed up development and allow us to have more
control over releases, versions, etc. I've seen cases in the past where
changes fall through the cracks because they didn't get manually merged
up/down. The ability to merge complete branches as a branch rather than
many different commits could save some hastle assuming everyone follows
the same commit/merge patterns.


I'd suggest looking into the svnmerge tool:

http://www.orcaware.com/svn/wiki/Svnmerge.py

We use it quite a bit at ActiveState to migrate changes from developer
branches into trunk, and then into production. I would caution that the
merge actions might not be to everyone's taste, so you might want to
either encourage use or ban use depending on your preference.

We have some quite excellent Komodo-specific documentation a co-worker
wrote on how to use this tool for developer branches specifically; I
could see about providing this to someone if they want to try svnmerge out.



This sounds very similar to what the latest versions of SVN has already done, 
specifically merge tracking.  You should now be able to use the 'svn merge' 
command and have it track changes for you across branches so you don't have to 
worry about merging the same rev twice etc.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] svn checkout suggestion

2009-07-16 Thread shire

Jani Taskinen wrote:

Rasmus Lerdorf wrote:

One of the benefits of svn is that we can do cross-branch commit pretty
easily now and thus avoid multiple similar commits with annoying MFH/MFB
commit log messages that are hard to track.


I did a commit today on all 3 branches and it worked fine except for the
fact that no commit email was sent anywhere. I sent Gwynne a note about
it, but so that everyone knows too, don't commit anything like that
until this is fixed..



Do we have a long-term plan of using actual merge commands/tools to merge our 
branches rather than duplicating commits or manually merging?  I think this 
could speed up development and allow us to have more control over releases, 
versions, etc.  I've seen cases in the past where changes fall through the 
cracks because they didn't get manually merged up/down.  The ability to merge 
complete branches as a branch rather than many different commits could save 
some hastle assuming everyone follows the same commit/merge patterns.

-shire  


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] 5.3.0RC3

2009-06-03 Thread shire


Lukas Kahwe Smith wrote:

Aloha,

Well we have waited long enough for a fix for re2c, but Johannes and I
(based on Scott's advice) have now decided that we can live with the
current hack that simply reverts to PHP 5.2 behavior (slower parsing,
only relevant for people not using a byte code cache) when re2c fails to
handle things. Derick is looking into this issue, so we might have a fix
in time.




I've been holding on to this patch until Derick can take a look at re2c to see 
if that would be a preferred fix, however here is my patch to make yyfill() 
realloc for the end-of-buffer scanning.  This also fixes the ini scanner in the 
same fashion, and adds some length arguments for parsing functions.


diff --git a/Zend/zend_globals.h b/Zend/zend_globals.h
index efda001..833617f 100644
--- a/Zend/zend_globals.h
+++ b/Zend/zend_globals.h
@@ -271,6 +271,9 @@ struct _zend_ini_scanner_globals {
unsigned char *yy_limit;
int yy_state;
zend_stack state_stack;
+   unsigned char *buf;
+   unsigned char *buf_limit;
+   size_t buf_offset;
 
 	char *filename;

int lineno;
@@ -291,7 +294,10 @@ struct _zend_php_scanner_globals {
unsigned char *yy_limit;
int yy_state;
zend_stack state_stack;
-   
+   unsigned char *buf;
+   unsigned char *buf_limit;
+   size_t buf_offset;
+
 #ifdef ZEND_MULTIBYTE
/* original (unfiltered) script */
unsigned char *script_org;
diff --git a/Zend/zend_ini.h b/Zend/zend_ini.h
index 23e89da..6985924 100644
--- a/Zend/zend_ini.h
+++ b/Zend/zend_ini.h
@@ -197,7 +197,7 @@ END_EXTERN_C()
 typedef void (*zend_ini_parser_cb_t)(zval *arg1, zval *arg2, zval *arg3, int 
callback_type, void *arg TSRMLS_DC);
 BEGIN_EXTERN_C()
 ZEND_API int zend_parse_ini_file(zend_file_handle *fh, zend_bool 
unbuffered_errors, int scanner_mode, zend_ini_parser_cb_t ini_parser_cb, void 
*arg TSRMLS_DC);
-ZEND_API int zend_parse_ini_string(char *str, zend_bool unbuffered_errors, int 
scanner_mode, zend_ini_parser_cb_t ini_parser_cb, void *arg TSRMLS_DC);
+ZEND_API int zend_parse_ini_string(char *str, int len, zend_bool 
unbuffered_errors, int scanner_mode, zend_ini_parser_cb_t ini_parser_cb, void 
*arg TSRMLS_DC);
 END_EXTERN_C()
 
 /* INI entries */

diff --git a/Zend/zend_ini_parser.y b/Zend/zend_ini_parser.y
index 42e292e..c237581 100644
--- a/Zend/zend_ini_parser.y
+++ b/Zend/zend_ini_parser.y
@@ -218,7 +218,7 @@ ZEND_API int zend_parse_ini_file(zend_file_handle *fh, 
zend_bool unbuffered_erro
 
 /* {{{ zend_parse_ini_string()

 */
-ZEND_API int zend_parse_ini_string(char *str, zend_bool unbuffered_errors, int 
scanner_mode, zend_ini_parser_cb_t ini_parser_cb, void *arg TSRMLS_DC)
+ZEND_API int zend_parse_ini_string(char *str, int len, zend_bool 
unbuffered_errors, int scanner_mode, zend_ini_parser_cb_t ini_parser_cb, void 
*arg TSRMLS_DC)
 {
int retval;
zend_ini_parser_param ini_parser_param;
@@ -227,7 +227,7 @@ ZEND_API int zend_parse_ini_string(char *str, zend_bool 
unbuffered_errors, int s
ini_parser_param.arg = arg;
CG(ini_parser_param) = &ini_parser_param;
 
-	if (zend_ini_prepare_string_for_scanning(str, scanner_mode TSRMLS_CC) == FAILURE) {

+   if (zend_ini_prepare_string_for_scanning(str, len, scanner_mode 
TSRMLS_CC) == FAILURE) {
return FAILURE;
}
 
diff --git a/Zend/zend_ini_scanner.h b/Zend/zend_ini_scanner.h

index cef499f..30b6103 100644
--- a/Zend/zend_ini_scanner.h
+++ b/Zend/zend_ini_scanner.h
@@ -30,7 +30,7 @@ BEGIN_EXTERN_C()
 int zend_ini_scanner_get_lineno(TSRMLS_D);
 char *zend_ini_scanner_get_filename(TSRMLS_D);
 int zend_ini_open_file_for_scanning(zend_file_handle *fh, int scanner_mode 
TSRMLS_DC);
-int zend_ini_prepare_string_for_scanning(char *str, int scanner_mode 
TSRMLS_DC);
+int zend_ini_prepare_string_for_scanning(char *str, int len, int scanner_mode 
TSRMLS_DC);
 int ini_lex(zval *ini_lval TSRMLS_DC);
 void shutdown_ini_scanner(TSRMLS_D);
 END_EXTERN_C()
diff --git a/Zend/zend_ini_scanner.l b/Zend/zend_ini_scanner.l
index 4485bec..33a0f10 100644
--- a/Zend/zend_ini_scanner.l
+++ b/Zend/zend_ini_scanner.l
@@ -37,9 +37,7 @@
 #include "zend_ini_scanner_defs.h"
 
 #define YYCTYPE   unsigned char

-/* allow the scanner to read one null byte after the end of the string (from 
ZEND_MMAP_AHEAD)
- * so that if will be able to terminate to match the current token (e.g. 
non-enclosed string) */
-#define YYFILL(n) { if (YYCURSOR > YYLIMIT) return 0; }
+#define YYFILL(n) { if ((YYCURSOR + n) >= YYLIMIT) yy_fill(n); }
 #define YYCURSOR  SCNG(yy_cursor)
 #define YYLIMIT   SCNG(yy_limit)
 #define YYMARKER  SCNG(yy_marker)
@@ -57,13 +55,6 @@
 #define yyless(x)YYCURSOR = yytext + x
 /* #define yymore() goto yymore_restart */
 
-/* perform sanity check. If this message is triggered you should

-   increase the ZEND_MMAP_AHEAD value in the zend_streams.h file */
-/*!max:re2c */
-#if ZEND_MMAP_AHEAD < (YYMAXFILL + 1)
-# error ZEND_MMAP_AHEAD

Re: [PHP-DEV] Removing zend_hash_func in PHP 6.0

2009-06-02 Thread shire

Cristian Rodríguez wrote:

On 26/05/09 08:36, Antony Dovgal wrote:


12:48<@tony2001> you can remove it, but leave a macro to make sure old
code still works


or just use __attribute__((alias(""))); ...



How well is that supported on different compilers/architectures?  GCC says it's "Not 
all target machines support this attribute.".  Probably safer to user macros for 
compliance.  ;-)

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Removing zend_hash_func in PHP 6.0

2009-06-02 Thread shire

Lukas Kahwe Smith wrote:


On 30.05.2009, at 01:29, shire wrote:


Cristian Rodríguez wrote:

David Soria Parra escribió:

Hi List,

I recently discovered that zend_hash_func is equal to
zend_get_hash_value. To clean this up, I would like to remove
zend_hash_func in favor of zend_get_hash in HEAD. If there are no
objections I would commit a patch in a few days.


We can start with the attached patch and then remove it in the future...


Yeah, althoughI would suggest we put this patch as a TODO for php-5.4,
and then completely remove it in 6



just FYI ..
from all i have heard from people, the next version will be 6.0, there
might be a 5.4 after 6.0 that back ports some stuff and more importantly
adds soem forwards compatibility. but for now lets focus on 6.0 (well
after relasing 5.3).


Ok Right, I think there was some discussion around them needing a php-5.4 
release prior to php-6 but either way, because of how we do our branching 
currently I assume it'll be easier to do this there *if* it's something that 
does in fact happen.  So maybe this should be noted somewhere for when we are 
sure on that?

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Removing zend_hash_func in PHP 6.0

2009-05-29 Thread shire

Cristian Rodríguez wrote:

David Soria Parra escribió:

Hi List,

I recently discovered that zend_hash_func is equal to
zend_get_hash_value. To clean this up, I would like to remove
zend_hash_func in favor of zend_get_hash in HEAD. If there are no
objections I would commit a patch in a few days.


We can start with the attached patch and then remove it in the future...


Yeah, althoughI would suggest we put this patch as a TODO for php-5.4, and then 
completely remove it in 6

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Removing zend_hash_func in PHP 6.0

2009-05-26 Thread shire

Antony Dovgal wrote:

On 26.05.2009 16:13, David Soria Parra wrote:

Hi List,

I recently discovered that zend_hash_func is equal to
zend_get_hash_value. To clean this up, I would like to remove
zend_hash_func in favor of zend_get_hash in HEAD. If there are no
objections I would commit a patch in a few days.


12:48<@tony2001>  dsp__: I see no point in removing it completely and breaking 
BC for no reason
12:48<@tony2001>  you can remove it, but leave a macro to make sure old code 
still works



I would like to see us clean up items like this as part of the PHP-6 release, 
part of the list of todo items was to get rid of any bits like this that can be 
unnecessarily confusing etc.

I do think it should be done in a way that deprecates one of the functions, but 
not the other to make migration easier for extensions (unless we have a 
compelling reason for a complete rename).

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [PATCH] Scanner "diet" with fixes, etc.

2009-05-04 Thread shire

Matt Wilmas wrote:


Gotcha. If something changes, YYFILL -- or something to handle what
needs to be done -- could just be added to the manual parts as
necessary, right?



Sorry forget to reply on this one, but yeah we'd have to do a manual call to 
YYFILL or a check or whatever we come up with wherever we're scanning ahead 
manually.  (there's probably some other parts that might need this as well).

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [PATCH] Scanner "diet" with fixes, etc.

2009-05-04 Thread shire

Matt Wilmas wrote:

Hi Brian,

- Original Message -
From: "shire"
Sent: Monday, May 04, 2009



Hey Matt,

Matt Wilmas wrote:


+/* To save initial string length after scanning to first variable,
CG(doc_comment_len) can be reused */
+#define double_quotes_scanned_len CG(doc_comment_len)
+


(minor) Maybe we should rename this var if we're going to use it for
other
purposes, this doesn't really save any typing. Also if we do want the
define maybe we should upper case it so it's more obvious?


Yeah, I tried to think of other ways to do it, but just left it trying
to look like another variable (not to save typing). Well, it can easily
be changed later if a "cleaner" way is decided...



Yeah I would just prefer if it was more obvious that it is *not* a
variable ;-)


How about this?

#define SET_DOUBLE_QUOTES_SCANNED_LENGTH(len) CG(doc_comment_len) = (len)
#define GET_DOUBLE_QUOTES_SCANNED_LENGTH() CG(doc_comment_len)



Sure, works for me ;-)



+ while (YYCURSOR < YYLIMIT) {
+ switch (*YYCURSOR++) {


In the example above, which we have a couple examples of here, we don't
obey the YYFILL macro to detect if we have exceeded our EOF. This
*might* be a problem, but only really depends on if we intend to use
the
YYFILL as a solution for exceeding our mmap bounds.


I don't understand what the problem might be? The YYCURSOR < YYLIMIT
check is what the YYFILL has been doing. If you mean after changes
later, as long as the the whole thing is mmap()'d (which I'm assuming
would be the case?), it just "looks" like a standard string, with
terminating '\0', right? And there's no reading past YYLIMIT.


Sorry yeah this wouldn't be a problem currently, but only if we try to
fix the mmap issue by using YYFILL to realloc more space into the buffer.
Then that macro would change to something more complicated. (per my
previous replies with Arnaud)


Gotcha. If something changes, YYFILL -- or something to handle what
needs to be done -- could just be added to the manual parts as
necessary, right?


Have you considered using the lexer STATES and regex's instead of the
manual C code for scanning the rest. It seems like if we have a one-char
regex match for what the C code is doing we could handle this in the
lexer without a lot of manual intervention (need to look at it more, just
a thought I had earlier, the expressions are clearer now with your patch
applied) ;-)


It seems that matching one-char-at-a-time with re2c would be more
complicated than the manual way, not to mention slower than the old
(current) way.

Do you have any objection (well, you've kinda mentioned some :-)) if I'd
commit the changes in a little while like Dmitry thought could be done?


Well I'm wondering if something more along these lines (just did this on-top of 
your patch as you cleaned up a lot) might be more appealing.  (I'm not sure how 
much slower this would be than the current implementation, obviously it'll be 
somewhat slower, I'm basically just doing what you did in C but in the scanner 
instead of course).

"#"|"//" {
BEGIN(ST_EOL_COMMENT);
yymore();
}

({NEWLINE}|"%>"|"?>") {
char tmp = *(YYCURSOR-1);
if ((tmp == '%' && CG(asp_tags)) | tmp == '?') {
  YYCURSOR -= 2;
}
CG(zend_lineno)++;
BEGIN(ST_IN_SCRIPTING);
return T_COMMENT;
}

{ANY_CHAR} {
if (YYCURSOR >= YYLIMIT) {
  BEGIN(ST_IN_SCRIPTING);
  return T_COMMENT;
}
yymore();
}



Let me know what the thoughts are on the above, if we don't want that then I 
say yeah, commit away!

-shire





--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [PATCH] Scanner "diet" with fixes, etc.

2009-05-04 Thread shire


Hey Matt,

Matt Wilmas wrote:


+/* To save initial string length after scanning to first variable,
CG(doc_comment_len) can be reused */
+#define double_quotes_scanned_len CG(doc_comment_len)
+


(minor) Maybe we should rename this var if we're going to use it for
other
purposes, this doesn't really save any typing. Also if we do want the
define maybe we should upper case it so it's more obvious?


Yeah, I tried to think of other ways to do it, but just left it trying
to look like another variable (not to save typing). Well, it can easily
be changed later if a "cleaner" way is decided...



Yeah I would just prefer if it was more obvious that it is *not* a variable ;-)


+ while (YYCURSOR < YYLIMIT) {
+ switch (*YYCURSOR++) {


In the example above, which we have a couple examples of here, we don't
obey the YYFILL macro to detect if we have exceeded our EOF. This
*might* be a problem, but only really depends on if we intend to use the
YYFILL as a solution for exceeding our mmap bounds.


I don't understand what the problem might be? The YYCURSOR < YYLIMIT
check is what the YYFILL has been doing. If you mean after changes
later, as long as the the whole thing is mmap()'d (which I'm assuming
would be the case?), it just "looks" like a standard string, with
terminating '\0', right? And there's no reading past YYLIMIT.


Sorry yeah this wouldn't be a problem currently, but only if we try to fix the 
mmap issue by using YYFILL to realloc more space into the buffer.  Then that 
macro would change to something more complicated. (per my previous replies with 
Arnaud)

Have you considered using the lexer STATES and regex's instead of the manual C 
code for scanning the rest.  It seems like if we have a one-char regex match 
for what the C code is doing we could handle this in the lexer without a lot of 
manual intervention (need to look at it more, just a thought I had earlier, the 
expressions are clearer now with your patch applied) ;-)

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [PATCH] Scanner "diet" with fixes, etc.

2009-05-04 Thread shire

Arnaud Le Blanc wrote:

Hi,
On Mon, May 4, 2009 at 9:36 AM, shire  wrote:

Regarding the ZEND_MMAP_AHEAD issue and the temp. fix that Dmitry put in we
need to find a solution to that, perhaps I can play with that this week too
as I think I'm seeing some related issues in my testing of 5.3.  Essentially
we abuse ZEND_MMAP_AHEAD by adding it to the file size and passing it to the
mmap call which isn't at all valid and only really works up to PAGESIZE.  We
could possibly use YYFILL to re-allocate more space as necessary past the
end of file to fix this.


I was thinking of doing something like that with YYFILL too. However
there is a bunch of pointers to take in to account and to update (e.g.
yy_marker, yy_text, etc).



Yeah, I'm pretty sure that's how most of the example re2c code is setup:

#define YYFILL(n) {cursor = fill(s, cursor);}

uchar *fill(Scanner *s, uchar *cursor){
if(!s->eof){
  unint cnt = s->lim - s->tok;
  uchar *buf = malloc((cnt + 1)*sizeof(uchar));
  memcpy(buf, s->tok, cnt);
  cursor = &buf[cursor - s->tok];
  s->pos = &buf[s->pos - s->tok];
  s->ptr = &buf[s->ptr - s->tok];
  s->lim = &buf[cnt];
  s->eof = s->lim; *(s->eof)++ = '\n';
  s->tok = buf;
}
return cursor;
}



-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [PATCH] Scanner "diet" with fixes, etc.

2009-05-04 Thread shire



Hey Matt,

Thanks for posting, sorry for not having a chance to reply to this sooner.  
Maybe couple things from the patch,


+/* To save initial string length after scanning to first variable, 
CG(doc_comment_len) can be reused */
+#define double_quotes_scanned_len CG(doc_comment_len)
+


(minor) Maybe we should rename this var if we're going to use it for other 
purposes, this doesn't really save any typing.  Also if we do want the define 
maybe we should upper case it so it's more obvious?


+   while (YYCURSOR < YYLIMIT) {
+   switch (*YYCURSOR++) {


In the example above, which we have a couple examples of here, we don't obey 
the YYFILL macro to detect if we have exceeded our EOF.  This *might* be a 
problem, but only really depends on if we intend to use the YYFILL as a 
solution for exceeding our mmap bounds.

Regarding the ZEND_MMAP_AHEAD issue and the temp. fix that Dmitry put in we 
need to find a solution to that, perhaps I can play with that this week too as 
I think I'm seeing some related issues in my testing of 5.3.  Essentially we 
abuse ZEND_MMAP_AHEAD by adding it to the file size and passing it to the mmap 
call which isn't at all valid and only really works up to PAGESIZE.  We could 
possibly use YYFILL to re-allocate more space as necessary past the end of file 
to fix this.

I don't see anything glaring in the patch that's a major issue, I can probably 
test more on a larger code base in the next 2-3 days.  As I've said before this 
seems to be crossing the line of us writing a scanner by hand rather than 
letting re2c do the heavy lifting, but without a modification to re2c to handle 
EOF I don't have an alternative solution currently.  (If we had some way to 
detect which regex we where matching against in the YYFILL that would likely be 
able to handle these bugs, but I didn't see a way to do that easily).

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] RC2 and integer/float handling in 5.3

2009-04-06 Thread shire


Hey Matt,

Matt Wilmas wrote:

Yep, 5.3's snapshot self-compiled from a couple days ago on Windows (not
that that should matter). (I'm not regenerating it with re2c, which also
shouldn't matter; using the existing .c file. I haven't touched the
scanner stuff in a long time (yet) to regen.) Scanner of course hasn't
changed since then.


Here's what I'm currently doing (more or less with some changed paths):

$ echo $PHPCVS
:pserver:sh...@cvs.php.net:/repository

$ cvs -d $PHPCVS co -r PHP_5_3 php5
...

$ cd php5

$ ./buildconf && ../config.nice && make
...

$ cat ../bug46817-1.php

  array(3) {
[0]=>
int(368)
[1]=>
string(6) "
int(1)
  }
  [1]=>
  array(3) {
[0]=>
int(366)
[1]=>
"   string(57) "// this comment and trailing blank contain windows CR+LF
[2]=>
int(2)
  }
  [2]=>
  array(3) {
[0]=>
int(371)
[1]=>
string(3) "

"
[2]=>
int(2)
  }
}


The newlines look like this in the second file:


Test case is the one in the bug report. :-) Last token is not the
comment, but whitespace.


There are two reproductions in the bug report ;-)




Also, the unterminated comment Warning is still missing with "

I think fixing this would be great as well as the other highlighter test that 
was changed.  I would just prefer that the scanner handle these rather than us 
implementing what is essentially a hand-written scanner within the lexer file.


-shire



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] RC2 and integer/float handling in 5.3

2009-04-06 Thread shire

Matt Wilmas wrote:

Oh, other than that about the RC2 subject, I was also going to propose
some scanner changes like I mentioned to Brian on the 26th. Assuming it
works like in my head, it would basically be a big "diet" for the
scanner, making string/comment handling smaller and simpler, while
addressing the cause (for strings/comments at least) of Bug #46817
(tokenizer misses last single-line comment) which is wrongly marked
Closed and DONE on the wiki. Before RC1, Brian reverted his changes
that, I guess, had fixed it.


It is?  I just tested this again to be sure something wasn't missed and it 
looks like it's working to me.  Which test case where you using?

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_3) /ext/standard/tests/strings highlight_file.phpt ZendEngine2 zend_language_scanner.c zend_language_scanner.l zend_language_scanner_defs.h zend_stream.h

2009-03-26 Thread shire


Hey Matt,

Matt Wilmas wrote:

Hi all,

Brian, not sure if you're still trying to work on other scanner changes
to fix things, but wanted to say that I should have something (soon,
whenever) to fix the issues with \0 in strings/comments and/or YYFILL
returning too soon (maybe that's fixed, don't know).


Yep I checked in fixes for this, the scanner should now work with \0 bytes in 
these tokens.  I believe Nuno has thoughts on what he'd like to see done here 
instead but it works AFAIK. (there is a highlight change that happens because 
of re2c changes in 5.3, but I think there may be other fixes for this as it's a 
non-terminated string being highlighted as a comment).

btw this specific issue Lukas is bringing up has more to do with dealing with 
incorrect usage of MMAP to provide padding at the end of file for the scanner. 
(I documented this in the bug report if you're interested in more details).


Basically, my idea
is to just do a "manual scan" in a few places, and take re2c out of the
equation there. Would also make code smaller/simpler, and maybe a bit
faster. e.g. good even if re2c is fixed.


This sounds similar to what we do for anything outside the  tokens, but 
(without having seen your patch of course) I feel like we should have a scanner that 
handles this correctly rather than having to rewrite portions in C to handle these 
cases.  However I don't exactly have a patch ready for re2c so if this in some way 
improves the code as it currently stands, then so be it ;-).  Perhaps you and Nuno 
etc should discuss this more here as I know he isn't completely satisfied with the 
current implementation.


Like I've said previously, I don't understand how re2c can't correctly
deal with EOF. In the simplest form, if you had one re2c rule:

[a-z]+

It wouldn't even match with "foo" as the input string! :-/


I'm not sure I follow why the above case wouldn't work, but my thinking here is 
that if EOF where treated as a unique regex expression that was conditional on 
YYCURSER >= YYLIMIT (or probably something more concrete).  then \0 would not 
be the same as EOF, and it would be much easier to match for all these cases.  I 
would worry that this could be potentioally complicated for re2c to implement, but 
I don't know enough about re2c and haven't heard from anyone regarding the 
possibility of doing this.   It's my understanding that this would be essentially 
the same way flex opperates. (at least from the users perspective).

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_3) /ext/standard/tests/strings highlight_file.phpt ZendEngine2 zend_language_scanner.c zend_language_scanner.l zend_language_scanner_defs.h zend_stream.h

2009-03-26 Thread shire

Lukas Kahwe Smith wrote:

Bringing this back to the list since its still relevant apparently.

On 26.03.2009, at 19:43, shire wrote:

There was another bug here http://bugs.php.net/bug.php?id=47596 that
is slightly related, I made an update in the comments. Dmitry just put
in a temporary fix so that we can release this way if needed, however
we should make a longer term fix for the actual problem at some point.



Dmitry?



Just to be clear here, Dmitry gave us a heads up that his fix should be reverted once we 
come up with a better fix.  I think Marcus, Nuno, Scott, myself, and whomever else can 
contribute need to determine/implement a better fix for the mmap call.  Ideally I think 
we'd like a flex-like <> regex expression in re2c, but it would be helpful 
to get more feedback from those familiar with re2c who could implement this.  Alternatively 
we can probably dynamically allocate the data via YYFILL and adjust SCNG accordingly.  
Interested in other suggestions, as my scanner experience is limited.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] max_execution_time and async signal handling in apache2handler

2009-03-15 Thread shire


Hi Moriyoshi,

Moriyoshi Koizumi wrote:

Hi,

I got a bug report on the Japanese PHP user's list that states free()
aborts within the timer signal handler due to reentrance to the
function when max_execution_time takes effect and the signal occurs
within the same libc function. The reporter also states he uses
apache2handler, which doesn't provide block_interruptions nor
unblock_interruptions SAPI handlers in contrast to the apache1
handler.

Whilst I doubt this happens quite frequently because PHP has its own
memory pool and it's far more rare that the libc's allocators get
called than the emalloc() and efree(), this should be addressed
somehow. A proposed fix is attached but note this greatly compromises
overall performance due to excessive system calls.


Isn't this  already covered in a proposal here: 
http://wiki.php.net/rfc/zendsignals?  I believe the last hang-up keeping this 
from getting committed is in the ability to handle this properly and 
efficiently under ZTS builds, I would really like to see this get applied in a 
future PHP release though.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Re: [APC-DEV] Re: [PHP-DEV] [RFC] APC/PHP Lazy Loading

2009-03-14 Thread shire

Johannes Schlüter wrote:

On Fri, 2009-03-13 at 18:27 +0100, Kalle Sommer Nielsen wrote:

I must admit that I think it would be bad to delay 5.3 more and have
another beta because of this,


Well, according to Dmitry (didn't test it myself) this doesn't bring
real improvements so I don't think it's needed.


I want to emphasize that this does provide real performance increases, it's 
just going to depend heavily on the code structure.  Some projects may see 
gains, others may not.  Per my original post I'm interested in getting more 
testing done on various projects to see how general the gains are and what I 
can do to improve them.

To also re-iterate my previous emails I do agree that this probably isn't the 
best time to include this in 5.3, but would of course work to help get it in if 
the majority wanted to see that happen.  Considering the merge errors that 
Dmitry found I think it would be preferred to get this tested more before it's 
included in a major release, as I had originally intended.  I appreciate the 
code changes Dmitry has contributed and his initial testing.



  however if possible I would be all for
having APC in RC1 next week :)


This was asked on IRC multiple times and always said something along the
lines "talk to the current APC maintainers and if they think it's worth
propose to internals" as the current APC maintainers can judge stability
and risks better and have a better knowledge about missing 5.3 support.
But nobody seemed t be that interested.

By looking at the bug tracker I see 63 open APC bugs including missing
support for 5.3 features (http://pecl.php.net/bugs/bug.php?id=15901) and
some crashes from that I think APC benefits from an independent release
schedule. Additionally APC doesn't offer core functionality and will
need configuration to be useful (temp path, shm method, segment
sizes, ...) so I don't think the additional installation step is that
much of additional trouble.



I'm trying to get the majority of the APC bugs resolved, but I think this will 
take some time.  There's also some new and undocumented functionality added to 
APC as well as some major changes I plan to create in an APC-4.0 branch.  So I 
don't think this would be the best time to integrate it into core, especially 
considering the late notice.


-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] 5.3 items

2009-03-11 Thread shire

shire wrote:


Hey Lukas,

Just a heads up that I should have a fix for this soonish, just running
some more tests to make sure everything works as expected (I assume
nobody else has started work on this):

9. tokenizer misses last single-line comment
(http://bugs.php.net/bug.php?id=46817)



Ok, just checked this in, let me know if I missed anything or if there's any 
issues.  I noticed that CVS HEAD seemed to have dos line feeds and some other 
bad merge stuff so those where fixed as a side affect of my commit.  This 
should also correct some highlighting bugs that were introduced (see the fix to 
the highlight_file test).

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Re: [APC-DEV] Re: [PHP-DEV] [RFC] APC/PHP Lazy Loading

2009-03-11 Thread shire

Lukas Kahwe Smith wrote:

Can we get this patch to release quality by this weekend?
So that people can test it on Monday/Tuesday ahead of RC1?


I don't see this being a problem, I do have a few items I'd like to point out 
for feedback/suggestions:

1) Currently it doesn't support method level lazy loading, it's probably 
advisable to wait and include this later, but if everyone thinks it's 
significant enough we could probably add support for that.  I'm not sure on all 
the details this involves, perhaps someone familiar with method calls could 
suggest difficulty and/or optimized ways to do the lookups.

2) Currently I've inserted these hook calls into everywhere we call/lookup 
classes.  This has the downside that any extension wanting to mess with the 
function table, lookup entries, etc will need to be aware of the callback 
hooks.  I think a better architecture is to create an API for function table 
and class table operations.  Perhaps this could be done later if we want this 
likely more stable version in 5.3, and perhaps I'm overstating the significance 
of other functions doing these sort of things and not being aware of this new 
feature.  (I believe dmitry mentions several extenions that need changes to 
support this).  On the upside not creating an API means existing code will 
still work if they don't use lazy loading. ;-)

3) The largest portion of time currently is simply adding dummy entries to the 
lazy hash tables so we can detect name collisions and availability, my next 
goal is to improve the performance of this by either creating a faster copy of 
the entries or determine some better method of performing lookups/marking 
functions as available (something like lookups directly into the APC shared 
memory segment).  Just putting this out there in case anyone has some 
interesting suggestions.


I think all the above 3 items don't necessarily need to be done to have this 
work in 5.3 (and #3 is pretty much APC/opcode cache centric) and might cause 
unecessary complication for an initial support of this, but what does everyone 
else think?



Regarding a couple of Dmitry's suggestions:


1) I would prefer to add additional hash_value argument into 
lookup_function_hook() and lookup_class_hook to prevent multiple calculation.


I'm upset I didn't do that in the first place ;-)


2) function_exists() should use lookup_function_hook().
3) interface_exists() and class_alias() should use lookup_class_hook().


I'm pretty sure I already have support for this, it may have been lost while 
porting it over, I'll double check.


I'll follow up with Dmitry on getting these and any other items taken care of, 
thanks!

-shire



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [RFC] APC/PHP Lazy Loading

2009-03-11 Thread shire


Dmitry Stogov wrote:

Hi,

Personally, I like the patch except for some small possible tweaks, and
I believe it can't make any harm with lazy loading disabled.


Thanks, what are the tweaks you'd like to see so I can try to include them?


Could you provide some benchmark results?


I was hoping to solicit some from others on the list, but haven't seen anything 
yet.  My best example of gains is Facebook, I believe these where around 20-30% 
decrease in CPU usage on a bare-bones page.

I did test Joomla and Zend Framework, the gains here aren't much if anything as 
they seem to be use autoload for most files.  (although I would like to see 
what lazy method loading can do here).

I intend to benchmark wordpress as well.  I'll post more benchmarks for the 
above and this once I make some more tweaks that also might give us better 
results.  I anticipate that benchmarks are going to vary pretty drastically 
depending on code structure, size, autoloading, etc.


APC patch to play with it and see advantages/disadvantages?


I've gone ahead and checked in the current code for APC, so you'll have the 
necessary changes if you checkout CVS HEAD.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [RFC] APC/PHP Lazy Loading

2009-03-10 Thread shire

Lukas Kahwe Smith wrote:


On 22.02.2009, at 01:10, shire wrote:



I've just checked into APC CVS preliminary support for Lazy Loading
classes and functions. This means that rather than copying function
entries into EG(function_table) and EG(class_table) when an include
happen it will mark the functions/classes as available and only
actually insert them into the tables when they are called. This is
done via hooks added into the various hash table lookups in PHP. I've
placed a patch for PHP_5_3 at:

http://tekrat.com/downloads/bits/apc_lazy_php53.patch



I did not read through the entire thread. As things are close to RC
state in 5.3, I would prefer to not do any non bug fixes at this stage.
Then again if the benefits are huge and the risk is low it can be
considered of course ..



Yep, I should have been clearer here. ;-)   I don't believe this patch is ready 
to be committed and I'm not advocating for inclusion in the php53 release.  I 
have some more adjustments I'd like to make to both the APC code and this 
patch, both for optimization, cleanliness, and to ensure it's the best possible 
implementation for future enhancements (such as lazy loading methods).  I'm 
more interested in getting constructive feedback and any benchmark results so I 
can make adjustments to the current code.  However, I would like to come back 
with improved patches and get it included in the next major release as 
appropriate.

I'm all about fixing the remaining php53 todos/bugs and getting this release 
out!  I hope posting this wasn't a distraction from that.

Thanks!

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] 5.3 items

2009-03-09 Thread shire

Matt Wilmas wrote:


I don't have much time right now, but looked at it quick, and see that
you're actually trying to work around the re2c issues in general. :-) I
was only thinking of putting a "band-aid" on the comment symptom(s),
since those are about the only ones that occur with valid code (is the
tokenizer ext. *supposed to* handle all tokens in code that wouldn't
really compile?).


Yeah I figured I should try to fix as much as a could, specifically the YYLMIIT 
not enforcing availability of 'n' chars makes me nervous. ;-)

I would expect that tokenizer should handle all tokens in code as long as they 
pass the scanner phase (not the parser phase) but I'm not sure on what the 
intention here is.



And yeah, about excluding \x00 from ANY_CHAR, it could
change things, since it's always been allowed, although it seems strange
that code would have literal NULLs in it (generated eval()'d code?).
That was part of the reason I couldn't come up with a generic fix while
keeping all behavior. If re2c would just remember the last matching
state it was in at EOF like Flex!



It seems to me like the crux of the problem here is that we can't integrate an EOF check 
(such as checking the length of data) within the regular expression.  While flex allows the 
<> we are expected to provide a unique identifier/token to match on.  This 
assumes that we have a unique character, or that the data is in good form so that we can 
detect a token etc.  Perhaps a good feature to add to re2c would be able to include a 
special regex/token match that would identify special conditions programatically such as 
(YYCURSOR == YYLIMIT) etc.

In defense of re2c I think it could be useful in situations to have to 
explicitly handle EOF, as it allows you more freedom for processing different 
data types.

I'll have to look closer at the multi-byte processing as well.  I don't see a 
lot of cases where we would run into \x00 values in code.  (Perhaps someone can 
provide a suggested use case that we need to watch out for?)  Perhaps if 
someone is including binary data strings within code?.



Otherwise, I don't know what to do. :-/ I'm going to do something else
before trying to implement what I was going to do, so there's no patch
yet...



Ok, I'll keep working on this I guess then as there's a couple more tests I 
want to run and fix some things before I commit (like ensuring that YYLIMIT 
actually ensures there are 'n' bytes available to read, etc).
 


As far as the Warning, with "

I didn't see this in the current, un-patched, php-5.3 build but I'll double 
check to make sure I wasn't still using my new binaries.

 

And that applies to the case Lukas gave in the bug report: WHITESPACE
pattern is variable length.


Didn't see/find this is there a bug # or link?


I meant the "could be related if not the same problem" comment added the
other day in Bug #46817.



Ah, I see.  Yes this was actually my friend that raised my attention about 
getting this fixed ;-)


-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] 5.3 items

2009-03-09 Thread shire


Hey Matt,

Matt Wilmas wrote:


9. tokenizer misses last single-line comment
(http://bugs.php.net/bug.php?id=46817)


I was going to take care of that one, as I mentioned in a previous
message, though it's been awhile since I've been delayed much longer
with stuff here. :-( (Nothing set up for building PHP on this system
yet; hope to in the next several hours finally, and do some things!)



Sorry I missed you're earlier email.  I saw this sitting on the 5.3 todo list and it was 
breaking some of our parsing so I figured I'd take a stab at it.  Here is my current 
patch http://tekrat.com/downloads/bits/php53.scanner_eof.patch, please let me know if you 
have some suggestions/changes.  It sounds like you commented on this initially so please 
let me know what you/we should do ie: merging my patch/your work, commiting this, or if 
you had a better fix in mind etc.  My biggest complaint is that my current patch requires 
adding \x00 to any exclusion rules ("[^").

These changes for handling EOF should probably be ported to the INI scanner as 
well for the above reason and to keep them similar.


As far as I know there's still the other comment-related issue where no
Warning is giving about "Unterminated comment ..." for unclosed /* ...
It's all of course related to the fundamental re2c issue, for now, where
when the scanned input ends while a variable length part of a rule is
being matched, it just aborts ("return 0;") in YYFILL().


I don't seem to see this problem, perhaps I'm not reproducing it correctly?



And that applies to the case Lukas gave in the bug report: WHITESPACE
pattern is variable length.


Didn't see/find this is there a bug # or link?



-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] 5.3 items

2009-03-09 Thread shire


Hey Lukas,

Just a heads up that I should have a fix for this soonish, just running some 
more tests to make sure everything works as expected (I assume nobody else has 
started work on this):

9.  tokenizer misses last single-line comment 
(http://bugs.php.net/bug.php?id=46817)

What's the details on this one?

20. memory leak in the scanner when a new state stack is created


-shire



Lukas Kahwe Smith wrote:

Hello All,

I am back from my vacation in Tanzania. I will be in Innsbruck over the
weekend for some Frisbee action, but I hope to get back into the RM
business Sunday evening or Monday. I went through all my emails
yesterday and marked several for reading, which I will do on the train
ride if all goes well. That being said, it would also be useful if
anyone who is looking after issues that are still open and that should
be addressed before the next release (beta2 or RC1) can send Johannes
and myself a quick email. If anything needs discussion raise it on
internals ASAP.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org







--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Re: [APC-DEV] [RFC] APC/PHP Lazy Loading

2009-02-26 Thread shire

Rasmus Lerdorf wrote:

shire wrote:

I agree for the general case, in our development environment though this
might cause some pains.  But we could always start there and see how it
goes.   I agree that Xdebug isn't really a use case we always need to
optimize for.


Is it ever a case we need to optimize for?  If you are running xdebug,
you aren't worried about execution speed.  You certainly aren't going to
be running your PHP under xdebug in any sort of production environment.



I agree I don't see a case for ever using XDebug in a production environment 
with APC (unless you're providing some sort of developer service, and even 
then).  Obivously the user cache needs to function as you could be debugging 
something related.  In our development environment I'm just not sure how much 
additional time this will cost us in addition to xdebug.  I believe it takes 6 
seconds (roughly there might be other crud going on there) for us to compile 
all the code into opcodes for execution, so tacking this on to xdebug might 
cause some headaches for developers.

But like I said, I think this case is pretty extraordinary, so for the sake of 
getting this great feture in if we need to disable this then I think I'm fine 
with figuring out some other solution to it should it actually be a problem at 
a later date. Even if I can disable this in APC then that might be enough to 
work around it.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Re: [APC-DEV] [RFC] APC/PHP Lazy Loading

2009-02-26 Thread shire

Rasmus Lerdorf wrote:

Gopal V wrote:

shire wrote:


http://tekrat.com/downloads/bits/apc_lazy_php53.patch

You should be able to apply the above patch to the latest PHP_5_3
branch, and recompile the latest APC CVS against it.  Two ini settings
enable/display lazy loading:

apc.lazy_functions=On/Off
apc.lazy_classes=On/OFf

Awesome!


Yup, I am all for the lazy loading patch as well.  I don't think people
really realize how much code they load that is never run.  Especially
all the folks with massive front controllers.



Thanks, I would be interested in hearing how successful it is on different 
setups just to validate it.  I'll start making some more cleanups to the 
patches and more optimizations in the next month or two so hopefully we'll see 
some additional improvements in the code and performance.






Alternative implementations would include replacing the function
entries with an internal function that would load the opcodes when
called, however I found this implementation to be problematic, still
requires changes to PHP, and would also require inserting entries into
the function/class tables which itself ends up being an expensive task
for extremely large codebases.

I still haven't given up on the executor hooks. But can't argue with
code that works (yes, it works for most of my tests).



By executor hooks do you mean actually the executor hooks in Zend, or replacing 
a stub in the function table like I described?  I like the idea of stubbing out 
function table entries, but I think this makes it difficult to avoid the costs 
of updating the function table, I'm hoping that I can make some optimizations 
that cut this down so I guess we'll see in a while if I'm able to do that as it 
might shed some light on how much we can gain by going one way or the other 
etc.  If you're thinking of something else (I know we discussed this before, 
but maybe I misunderstood), I'd like to hear the details.  ;-)





I should finish up the RO patches in place so that we can catch stuff
being overwritten in shm without locks - reflection, xdebug and suhosin
looks like potential culprits here.



I thought this was awesome when I read your blog post, should be great!




I wouldn't worry about xdebug at all.  We should probably just turn off
the opcode cache part when xdebug is active if it is a problem.


I agree for the general case, in our development environment though this might 
cause some pains.  But we could always start there and see how it goes.   I 
agree that Xdebug isn't really a use case we always need to optimize for.
 


-shire



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [RFC] APC/PHP Lazy Loading

2009-02-23 Thread shire



Ronald Chmara wrote:


On Feb 21, 2009, at 10:55 PM, shire wrote:

Hi Ronald,
Ronald Chmara wrote:

Wait... so if I understand this right, let's envision a code base where,
per some random page load, 70 functions are actually called, but, oh,
7,000, or even 700,000, are being included for whatever reason?
The speed optimization is in *not* copying a massive amount of things
that weren't even needed, or used, in the first place?

Essentially, yes, this is probably best summed up by the 80/20 rule
where we only use 20% of the code etc...


Well, I can see 80% actually *used* code, with 20% in there by
accident but 80% unused code? eep! ack! Call the villagers and get
the torches and pitchforks!...


I should probably be more specific, it's not that you *don't* use the other 80% 
it's that you use 20% of the code 80% of the time: 
http://en.wikipedia.org/wiki/Pareto_principle

Of course I'm abusing this slightly, but this is the *basic* idea, and it's not 
based on actual usage statistics.  In reality you probably use all 100% of your 
code in smaller chunks depending on the request.  As an illustrative example, 
if you have HTTP requests for setting, retrieving, deleting items in a database 
which is supported by corresponding functions.  In each request you'll only use 
one of these functions (30%), but you'll still need to load the entire 
file/class.  The goal of lazy loading is to optimize this and similar 
situations so you don't have to re-organize your code into unfeasibly small 
files, or in other ways that might inhibit productivity.

 

One thing I see as quite a beneficial future outcome of your work is the
ability to further profile code, and be able to seek out code that marks
massive amounts of functions as "available" without actually ever
using them.


I think this would be better instrumented through tools actually designed to do 
this sort of profiling, specifically XDebug, or tools like inclued are very 
useful.

http://pecl.php.net/package/xdebug
http://pecl.php.net/package/inclued



It's actually likely that only a fraction of the code at Facebook will
be used in a request, hence the need for lazy loading.


Ouch. Seriously.

I can't tell you how to build your code, but I think you might seriously
benefit from:

Does that make sense? Or did you try it already? :)



I'd rather not enter into a public discussion of Facebook's current 
optimization tactics outside of this patch, but suffice to say we have 
implemented many optimization techniques, both in PHP code and Internals, many 
of which we have and will continue to contribute back to the community.   Lazy 
Loading has been a very necessary and optimal solution to this one aspect of 
our scalability, thus allowing our engineers to focus more of their time and 
energy on creating new and exciting features for Facebook.


Thanks,

-shire

 




--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [RFC] APC/PHP Lazy Loading

2009-02-21 Thread shire


Hi Ronald,

Ronald Chmara wrote:

Wait... so if I understand this right, let's envision a code base where,
per some random page load, 70 functions are actually called, but, oh,
7,000, or even 700,000, are being included for whatever reason?

The speed optimization is in *not* copying a massive amount of things
that weren't even needed, or used, in the first place?


Essentially, yes, this is probably best summed up by the 80/20 rule where we 
only use 20% of the code etc...



However, there's still the horribly massive speed hit of semi-loading,
and marking, a fairly large amount of unused, un-needed, functions, as
available?


I don't agree with the description of describing this as a "horribly massive speed 
hit" at least in comparison with what was happening without lazy loading.  Also, 
like I said there's further iterations I plan to make here, one of these being increasing 
the performance of this marking functions as available.



I do see the benefit of lazy loading, I'm just not very comfortable with
enabling a philosophy of loading up a massive amount of CPU and RAM with
"just in case they're wanted" features and code in the first place.


Well I am assuming that this is what a large amount of code does already, 
except that without lazy loading the situation is significantly worse.  Your 
point that we should be sure this does not encourage poor coding practices is 
well taken, but it's been my experience that code tends to take this form 
regardless so I'm hoping to make the best of the situation ;-).

Also keep in mind that there are cases where you may not know in advance which 
functions you will/will not call, but it's probably fair to say that the 80/20 
rule still holds, so including all the functions you may need is not 
particularly a misuse of the language, but rather a necessity of a dynamic 
application and language.



It certainly can boost an APC code set such as facebook, where many of
those files and functions *will* likely be used in the next 20 minutes
or so, but I also fear that it will encourage programmers to load
everything they have, every time, just in case they need it and 2Gb
apache processes (and APC space) can be ugly.


I'm not entirely clear on where code being used in the next 20 minutes come 
into play, what differenc does 100 milliseconds vs. 20 minutes make in APC/lazy 
loading?   It's actually likely that only a fraction of the code at Facebook 
will be used in a request, hence the need for lazy loading.  But we need to 
make all the functions available due to the dynamic nature of the site and PHP 
itself as we don't know ahead of time which functions we will need to call.

Thanks for the feedback, and hopefully the above makes sense.  I don't want to 
encourage bad progarmming form and would definitely encourage avoiding this 
situation if possible, however I think many applications may find this 
optimization a necessity.

-shire



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] [RFC] APC/PHP Lazy Loading

2009-02-21 Thread shire


I've just checked into APC CVS preliminary support for Lazy Loading classes and 
functions.  This means that rather than copying function entries into 
EG(function_table) and EG(class_table) when an include happen it will mark the 
functions/classes as available and only actually insert them into the tables 
when they are called.  This is done via hooks added into the various hash table 
lookups in PHP.  I've placed a patch for PHP_5_3 at:

http://tekrat.com/downloads/bits/apc_lazy_php53.patch

You should be able to apply the above patch to the latest PHP_5_3 branch, and 
recompile the latest APC CVS against it.  Two ini settings enable/display lazy 
loading:

apc.lazy_functions=On/Off
apc.lazy_classes=On/OFf

There's still some enhancements that I need to make in both PHP and APC for 
cleaner code and optimizations, but I wanted to get some early feedback to 
track down issues.  In a final version I'd prefer the above patch to abstract 
all function lookups into a common function.  I would love to hear about 
success/problems with this, as well as performance results or other 
suggestions.   This was initially implement for Facebook's codebase, and 
dropped CPU usage by about 30%, today it's required for the site to operate 
normally.  Although I expect smaller gains than this in other codebases 
(wordpress looks to be about 3%, Joomla and Zend Framework use autoloading and 
appear to get no visible gains).

Alternative implementations would include replacing the function entries with 
an internal function that would load the opcodes when called, however I found 
this implementation to be problematic, still requires changes to PHP, and would 
also require inserting entries into the function/class tables which itself ends 
up being an expensive task for extremely large codebases.

I look forward to hearing your feedback,

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] 5.3 todos

2009-02-18 Thread shire

Marcus Boerger wrote:

Hello shire,

Thursday, February 12, 2009, 8:02:06 PM, you wrote:



Lukas Kahwe Smith wrote:

The following remain open and it does not seem someone is actively
working in it:
- PHP_5_3 missed merge from PHP_5_2 for write_func callback



Seeing as I have an interest in this getting in 5_3, I'll work up a
patch for this unless someone wants to speak up that they've got it.  I
don't have Karma to Zend though, so I'll need someone to commit for me. ;-)


He there, thanks for looking into this (I sadly really cannot find much
more time these days than allows me to occasionally read PHP mails).
Thus I gave you karma now. If you want I can of course review your patch.


Thanks Marcus!  The patch should be in CVS now (via Dmitry) let me know if you 
or anyone else see any issues.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] 5.3 todos

2009-02-12 Thread shire


Lukas Kahwe Smith wrote:

The following remain open and it does not seem someone is actively
working in it:
- PHP_5_3 missed merge from PHP_5_2 for write_func callback


Seeing as I have an interest in this getting in 5_3, I'll work up a patch for 
this unless someone wants to speak up that they've got it.  I don't have Karma 
to Zend though, so I'll need someone to commit for me. ;-)

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] CG(zend_linecol)

2009-02-05 Thread shire

Sebastian Bergmann wrote:

  Would it be possible to implement CG(zend_linecol) in addition to
  CG(zend_lineno)? As far I can see, this is a missing prerequisite for
  Xdebug to report more fine-grained code coverage, for instance.



I've been thinking about this also, and would find this useful for some 
profiler/visualization purposes and perhaps for inclusion in errors.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] PHP_5_3 missed merge from PHP_5_2 for write_func callback

2009-01-27 Thread shire


I came across a problem in PHP_5_3 and HEAD because I was using a feature added in 
PHP_5_2.  It doesn't look like it was merged to HEAD to me, and then due to a 
merge from HEAD->PHP_5_3 it removed the changes made in PHP_5_2.  This was for 
the write_func call back in print_zval_r_ex.

Original commit to PHP_5_2: 
http://cvs.php.net/viewvc.cgi/ZendEngine2/zend.c?r1=1.308.2.12.2.10&r2=1.308.2.12.2.11

The merge from HEAD to PHP_5_3 that reverted it: 
http://cvs.php.net/viewvc.cgi/ZendEngine2/zend.c?r1=1.308.2.12.2.35.2.2&r2=1.308.2.12.2.35.2.3&;

Let me know if I missed something or if it would help if I put together a patch 
to rectify it.  I'm currently using this to print out values in the xLog 
extension (http://tekrat.com/php/xlog/)

Thanks,
-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] PATCH: zend_mm_heap_overflow()

2009-01-19 Thread shire

Stanislav Malyshev wrote:

Hi!


Yes, I like this idea better as it's more flexible but I wasn't sure
if we wanted that much visibility into the global variables of the
allocator. I suppose though, as with other things of this nature, if
you're mucking with this data then you should be doing so at your own
risk etc. ;-)


Well, messing up AG is not worse than messing up EG or CG - you'll end
up crashing pretty soon anyway :) Only problem might be that it may
introduce binary dependencies that will limit what we can do in memory
manager between versions, so we need to consider this carefully.


Yeah, definitely makes sense.  I'm also going on the assumption that this might 
be useful at some point later on, rather than my single use case.  Of course if 
it isn't then we don't have to worry about binary compatibility either lol.

I hadn't considered creating a hybrid approach to this, perhaps that might be a 
good alternative.  Rather than exposing the entire structure or creating 
one-off access functions (which has it's own BC issues), perhaps we should 
expose a new structure that is just a portion of the existing heap structure so 
we can expose only those items that we're willing to support.  This won't fix 
everything, but at least it might mitigate problems associated with opening up 
the entire existing heap structure and allow some flexibility with BC if it 
does change.

If it's interesting to everyone and saves you some time I can work up a patch 
so we can see what that might look like too.


Thanks!

-shire  


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] PATCH: zend_mm_heap_overflow()

2009-01-19 Thread shire

Stanislav Malyshev wrote:

Hi!


I'm releasing an extended PHP logging extension that we currently use
at facebook with much success. I currently use a small patch to
determine if a memory overflow has occurred as there's currently no
direct access into the allocator structures. You can get more
information on the project at http://tekrat.com/php/xlog/.


Maybe just make AG() exported?


Yes, I like this idea better as it's more flexible but I wasn't sure if we 
wanted that much visibility into the global variables of the allocator.  I 
suppose though, as with other things of this nature, if you're mucking with 
this data then you should be doing so at your own risk etc.  ;-)

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] PATCH: zend_mm_heap_overflow()

2009-01-19 Thread shire

John Carter -X (johncart - PolicyApp Ltd at Cisco) wrote:

Shire,

Xlog looks really useful. Does it also help with "exception thrown
without a stack trace"?



Thanks John,

I was just made aware that xdebug has similar functionality with a little bit 
different approach, config, etc so do check it out as well to see which meets 
your needs best.  I'm sorry that I wasn't aware of it's fatal catching 
capabilities before.

It depends on what you want to get from "Exception thrown without a stack 
frame" (I assume this is the error you meant not, stack trace).  There is no 
backtrace information available when this error occurs so nothing is going to get you a 
backtrace that i know of, however if there's some specific piece of data you'd like to 
have in the logs perhaps I can check to see if that's a possibility.  However in some 
states of the engine, getting anything useful can often be diffucult or unstable.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] PATCH: zend_mm_heap_overflow()

2009-01-18 Thread shire


I'm releasing an extended PHP logging extension that we currently use at 
facebook with much success.  I currently use a small patch to determine if a 
memory overflow has occurred as there's currently no direct access into the 
allocator structures.  You can get more information on the project at 
http://tekrat.com/php/xlog/.

It would be useful if this patch (and perhaps more accessors into the memory 
allocator) where added.  Although I understand there should be some careful 
choices and limitations here, and this case is a pretty specific use case, but 
I thought I'd share it in case there are others who happen to find this useful 
as well or perhaps someone can propose a more general alternative.

Patches for different branches are here (I've pasted php53 below):

http://tekrat.com/downloads/bits/zend_mm_heap_overflow.php6.patch
http://tekrat.com/downloads/bits/zend_mm_heap_overflow.php53.patch
http://tekrat.com/downloads/bits/zend_mm_heap_overflow.php52.patch


diff --git a/ZendEngine2/zend_alloc.c b/ZendEngine2/zend_alloc.c
index 8853d06..b8884a0 100644
--- a/ZendEngine2/zend_alloc.c
+++ b/ZendEngine2/zend_alloc.c
@@ -2537,6 +2537,13 @@ ZEND_API void start_memory_manager(TSRMLS_D)
 #endif
 }

+/*** BEGIN Patch: zend_mm_heap_overflow ***/
+ZEND_API int zend_mm_heap_overflow(TSRMLS_D)
+{
+return AG(mm_heap)->overflow;
+}
+/*** END Patch: zend_mm_heap_overflow ***/
+
 ZEND_API zend_mm_heap *zend_mm_set_heap(zend_mm_heap *new_heap TSRMLS_DC)
 {
  zend_mm_heap *old_heap;
diff --git a/ZendEngine2/zend_alloc.h b/ZendEngine2/zend_alloc.h
index d92df4b..3610931 100644
--- a/ZendEngine2/zend_alloc.h
+++ b/ZendEngine2/zend_alloc.h
@@ -231,6 +231,7 @@ struct _zend_mm_storage {
 };

 ZEND_API zend_mm_heap *zend_mm_startup_ex(const zend_mm_mem_handlers 
*handlers, size_t block_size, size_t reserve_size, int internal, void *params);
+ZEND_API int zend_mm_heap_overflow(TSRMLS_D); /* Patch: zend_mm_heap_overflow 
*/
 ZEND_API zend_mm_heap *zend_mm_set_heap(zend_mm_heap *new_heap TSRMLS_DC);
 ZEND_API zend_mm_storage *zend_mm_get_storage(zend_mm_heap *heap);



-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] PATCH: bug #46701

2009-01-05 Thread shire

Felipe Pena wrote:

Hi,

Em Sáb, 2009-01-03 às 11:20 -0800, shire escreveu:

Felipe Pena wrote:

Em Sáb, 2009-01-03 às 10:31 -0200, Felipe Pena escreveu:

Em Sáb, 2009-01-03 às 10:27 -0200, Felipe Pena escreveu:

Em Sex, 2009-01-02 às 21:03 -0800, shire escreveu:

I've created a patch for bug #46701 (http://bugs.php.net/bug.php?id=46701) but 
it requires Zend changes, it can be found at the links below for all branches.  
I've verified all tests pass.  We may want to verify it for other architectures 
due to the nature of it being a float conversion problem, I tested on Intel OS 
X 10.5.5.  But this should at least be equivalent to $array[intval($double)] 
now.


http://tekrat.com/patches/bug46701.php6.patch
http://tekrat.com/patches/bug46701.php53.patch
http://tekrat.com/patches/bug46701.php52.patch



I suppose that should we also change the fetch (ZEND_FETCH_DIM_*)?

And probably:
case IS_DOUBLE:
index = (long)Z_DVAL_P(dim);
goto num_index;

And zend_execute.c (zend_fetch_dimension_address_inner) will be
superfluous with this change.

Errr, I mean, the case IS_DOUBLE stuff in
zend_fetch_dimension_address_inner will be superfluous.



In the end, I see that it requires the macro too, hehe.
So, a little change in your patch:

-   case IS_DOUBLE:
-   index = (long)Z_DVAL_P(dim);
+   case IS_DOUBLE: {
+   DVAL_TO_LVAL(Z_DVAL_P(dim), index);
goto num_index;
-
+   }


Hey, Good point Felipe, thanks!  I've updated my  patches as well to include 
this change and verified all the tests.



I've committed it in 5_3 and HEAD. Ilia, should it be in 5_2 too?

Thanks for the patch, Shire. ;)




Fantastic, thanks for the commit and additional fix!

-shire


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] PATCH: bug #46701

2009-01-03 Thread shire

Felipe Pena wrote:

Em Sáb, 2009-01-03 às 10:31 -0200, Felipe Pena escreveu:

Em Sáb, 2009-01-03 às 10:27 -0200, Felipe Pena escreveu:

Em Sex, 2009-01-02 às 21:03 -0800, shire escreveu:

I've created a patch for bug #46701 (http://bugs.php.net/bug.php?id=46701) but 
it requires Zend changes, it can be found at the links below for all branches.  
I've verified all tests pass.  We may want to verify it for other architectures 
due to the nature of it being a float conversion problem, I tested on Intel OS 
X 10.5.5.  But this should at least be equivalent to $array[intval($double)] 
now.


http://tekrat.com/patches/bug46701.php6.patch
http://tekrat.com/patches/bug46701.php53.patch
http://tekrat.com/patches/bug46701.php52.patch



I suppose that should we also change the fetch (ZEND_FETCH_DIM_*)?

And probably:
case IS_DOUBLE:
index = (long)Z_DVAL_P(dim);
goto num_index;

And zend_execute.c (zend_fetch_dimension_address_inner) will be
superfluous with this change.

Errr, I mean, the case IS_DOUBLE stuff in
zend_fetch_dimension_address_inner will be superfluous.




In the end, I see that it requires the macro too, hehe.
So, a little change in your patch:

-   case IS_DOUBLE:
-   index = (long)Z_DVAL_P(dim);
+   case IS_DOUBLE: {
+   DVAL_TO_LVAL(Z_DVAL_P(dim), index);
goto num_index;
-
+   }



Hey, Good point Felipe, thanks!  I've updated my  patches as well to include 
this change and verified all the tests.

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] PATCH: bug #46701

2009-01-02 Thread shire


I've created a patch for bug #46701 (http://bugs.php.net/bug.php?id=46701) but 
it requires Zend changes, it can be found at the links below for all branches.  
I've verified all tests pass.  We may want to verify it for other architectures 
due to the nature of it being a float conversion problem, I tested on Intel OS 
X 10.5.5.  But this should at least be equivalent to $array[intval($double)] 
now.


http://tekrat.com/patches/bug46701.php6.patch
http://tekrat.com/patches/bug46701.php53.patch
http://tekrat.com/patches/bug46701.php52.patch


Thanks,
-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] PCRE symbol visibility bug

2008-12-09 Thread shire


When using GCC 4.x with php-5.3, and an extension (such as APC) that  
references PCRE functions (pcre_exec) that are bundled with PHP in the  
pcre extension.  The symbols defined in the PHP binaries don't include  
a visibility "default" attribute, and are currently set to "hidden" in  
CFLAGS.  This makes the symbols unavailable to any .so extension,  
which triggers a fault.


It seems like adding the visibility attribute to the bundled PHP code  
should fix this, but I wanted to post here as perhaps someone has a  
better suggestion.  It would probably be better if this didn't modify  
the pcrelib/* files.  I've included a patch as a possible fix, I can  
go ahead and commit this if this is what we want, but I'm hoping  
someone has some other suggestions.



before:
[EMAIL PROTECTED]:/usr/local/apache/libexec$ nm `which php` | grep  
pcre_exec

000266b4 t _php_pcre_exec

after:
[EMAIL PROTECTED]:/usr/local/apache/libexec$ nm `which php` | grep  
pcre_exec

000266b4 T _php_pcre_exec


cvs diff: Diffing .
cvs diff: Diffing doc
Index: pcre_internal.h
===
RCS file: /repository/php-src/ext/pcre/pcrelib/pcre_internal.h,v
retrieving revision 1.1.2.1.2.5.2.5
diff -u -r1.1.2.1.2.5.2.5 pcre_internal.h
--- pcre_internal.h 9 Sep 2008 07:55:08 -   1.1.2.1.2.5.2.5
+++ pcre_internal.h 9 Dec 2008 22:05:57 -
@@ -119,9 +119,17 @@
 #endif
 #  else
 #ifdef __cplusplus
-#  define PCRE_EXP_DECL   extern "C"
+#  if defined(__GNUC__) && __GNUC__ >= 4
+#define PCRE_EXP_DECL   __attribute__  
((visibility("default"))) extern "C"

+#  else
+#define PCRE_EXP_DECL   extern "C"
+#  endif
 #else
-#  define PCRE_EXP_DECL   extern
+#  if defined(__GNUC__) && __GNUC__ >= 4
+#define PCRE_EXP_DECL   __attribute__  
((visibility("default"))) extern

+#  else
+#define PCRE_EXP_DECL   extern
+#  endif
 #endif
 #ifndef PCRE_EXP_DEFN
 #  define PCRE_EXP_DEFN   PCRE_EXP_DECL


-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] An optimization idea

2008-11-10 Thread shire


On Nov 10, 2008, at 4:28 PM, Stan Vassilev | FM wrote:

This wouldn't really help with the case here of "if ($array1 ==  
$array2)..." though right?  (not to say it's not good, just making   
sure I understand ;-) ).


Yes I'm talking about speeding up scenarios full of hash lookups in  
general.


Ok, still seems though like this particular optimization might also  
provide additional benefits (albeit perhaps not nearly as significant).







It sounds like this would only work if the array contents where  
static though, as you're mapping a constant string to the contents  
of the  hash (or did I misunderstand and you'd be mapping string  
const. values  to hash IDs?).


My point is, replacing this process:

$a['foo'] or $a->foo -> compute hash of 'foo' -> find item for hash  
'foo' -> many items? -> resolve conflict -> return item


With this process:

$a[% string_literal_id_5 %] -> lookup item key 5 for array -> return  
item


Notice we skipped hash generation, and conflict resolution  
altogether. We only have the lookup for the integer id.
If some additional work is done, even this lookup can be eliminated  
and make this an O(1) process.


If instead the coder used variable:

$a[$bar] or $a->$foo (var array lookup and var var object lookup),  
then this optimization can't kick in, and the existing algorithm  
will be used.


However "static" access is the predominant usage, especially for  
objects, but also for arrays, so this should have significant impact.



Thanks for the clarification, this is pretty much the same idea as  
what I've been interested in working on next.  I think I was more  
inclined to store an extra hash value within the zvals themselves,  
with the hope that this could be expanded to non-constant values.  I  
believe ruby implements it's lookups this way (noted just for  
reference, not as an argument to copy another language ;-)  ).  Any  
thoughts on reasons not to do this (other than increasing the size of  
zval struct), it's pretty simple to implement this for static values I  
believe, dynamic values are a lot more difficult obviously...


-shire



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] An optimization idea

2008-11-10 Thread shire


On Nov 10, 2008, at 2:32 PM, Stan Vassilev | FM wrote:

I just ran a quick benchmark on this and I'm not seeing a  
significant real world change for Facebook's codebase. (definitely  
less than 1%) This was a pretty small test, without a lot of code  
execution, so I  could see other applications doing better.  I'm  
pretty neutral on this  one, it's not a really big change so might  
be worth adding if it's  going to give a few applications out there  
a gain, but I couldn't see  doing this everywhere of course.


-shire



Hi,

Something that could give arrays a boost would be to:

- pool all string literals in each file
- give each stringl iteral an id (unique in the loaded environment)
- bind early all array access with string literals (a common  
occurence), so they don't need to be resolved with a hashmap lookup,  
but rather, a direct stirng literal id lookup.


The benefit of this approach are:

- There's no need to generate a hashmap in the first place
- String literal id-s are unique integers and have no conflicts, so  
no need to do conflict resolution.




This wouldn't really help with the case here of "if ($array1 ==  
$array2)..." though right?  (not to say it's not good, just making  
sure I understand ;-) ).


It sounds like this would only work if the array contents where static  
though, as you're mapping a constant string to the contents of the  
hash (or did I misunderstand and you'd be mapping string const. values  
to hash IDs?).


I am at the point now where my #1 item CPU wise is my hash function.   
I'm working on finding a slightly faster implementation, but I'm also  
playing with having a hash value stored in every zval so that the hash  
function call can be skipped if we've already ran it once.  The  
problem that comes into play here is that there's no central code to  
update or clear the hash if the string or contents of the zval  
change.  Of course, doing this with constant values would be easier to  
accomplish and could be done at compile time, and like you say this is  
a pretty common scenario.


-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] An optimization idea

2008-11-10 Thread shire


On Nov 6, 2008, at 10:37 AM, shire wrote:



However, while the common compare_function call probably wouldent  
gain anything from this, maybe it could be put in the  
TYPE_PAIR(IS_STRING, IS_STRING) case or put into zendi_smart_strcmp  
(which only seems to ever be called by the TYPE_PAIR(IS_STRING,  
IS_STRING) case of compare_function).


I havent played with the patched version of zend_hash_compare to  
see if it might provide any real world savings. Though it appears  
as if that function might see a significant improvement for these  
cases.


Arrays and strings do seem to be the only real worth while  
comparisons for this type of optimization, but only because the  
costs of comparing long strings and large array lists make the  
savings more significant.  I think it might be hard to see this in a  
real world application though, so it would be nice if someone could  
justify the change by showing test results.  I'll try to bench  
Facebook's code base to see what we can get out of it.



I just ran a quick benchmark on this and I'm not seeing a significant  
real world change for Facebook's codebase. (definitely less than 1%)   
This was a pretty small test, without a lot of code execution, so I  
could see other applications doing better.  I'm pretty neutral on this  
one, it's not a really big change so might be worth adding if it's  
going to give a few applications out there a gain, but I couldn't see  
doing this everywhere of course.


-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] An optimization idea

2008-11-06 Thread shire


On Nov 5, 2008, at 7:22 AM, Graham Kelly wrote:


Hi,

I played with a patched version of compare_function with this  
optimization in it. Other then for strings it didn't seem to make  
much of a difference. A quick test showed that in an average php  
application only about 0.15% of calls to compare_function actually  
have the same op1 and op2. Considering it doesn't make much of a  
difference for anything except string comparison (and maybe arrays?)  
it probably isnt worth putting it in compare_function. This  
certinally is not a common case and would only serve to add an extra  
needless comparison and branch to the other 99% of calls to  
compare_function. Also, it seems as if in most cases this type of  
behavior could be detected and handled appropriatly at compile time  
by a good optimizer.


I started looking at this briefly, and agree with what you describe  
above.  I was curious to see if anyone came back with any real world  
applications where they where hitting this scenario but it seems  
unlikely.





However, while the common compare_function call probably wouldent  
gain anything from this, maybe it could be put in the  
TYPE_PAIR(IS_STRING, IS_STRING) case or put into zendi_smart_strcmp  
(which only seems to ever be called by the TYPE_PAIR(IS_STRING,  
IS_STRING) case of compare_function).


I havent played with the patched version of zend_hash_compare to see  
if it might provide any real world savings. Though it appears as if  
that function might see a significant improvement for these cases.


Arrays and strings do seem to be the only real worth while comparisons  
for this type of optimization, but only because the costs of comparing  
long strings and large array lists make the savings more significant.   
I think it might be hard to see this in a real world application  
though, so it would be nice if someone could justify the change by  
showing test results.  I'll try to bench Facebook's code base to see  
what we can get out of it.


-shire


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] An optimization idea

2008-10-20 Thread shire


On Oct 19, 2008, at 12:11 PM, Karoly Negyesi wrote:


Hi,

I think zend_hash_compare could get a healthy speed boost in some
cases if first it would check whether the two variables passed to it
are actually the same (because of "reference counting"). Sorry, my C
skills are way too rusty to write the patch which is likely to be just
a few lines long.




A quick patch/test seems to agree with you, I'd be interested to know  
if you/others  are able to apply this patch and see any real-world  
savings with your web application.  The following was done with a  
debug build of PHP so results are likely exaggerated.  I'm also using  
non-recursive array with 256 byte strings below, results with numeric  
values are less significant of course (approx a 25% gain)...


I'll also look into applying this to some other functions like  
compare_function where it's likely to yield a larger savings in a  
general application.



patch against php-5.2 CVS head
--
iff --git a/Zend/zend_hash.c b/Zend/zend_hash.c
index a1d7071..d11785f 100644
--- a/Zend/zend_hash.c
+++ b/Zend/zend_hash.c
@@ -1327,16 +1327,14 @@ ZEND_API int zend_hash_compare(HashTable *ht1,  
HashTable *ht2, compare_func_t co

IS_CONSISTENT(ht1);
IS_CONSISTENT(ht2);

-   HASH_PROTECT_RECURSION(ht1);
-   HASH_PROTECT_RECURSION(ht2);
-
result = ht1->nNumOfElements - ht2->nNumOfElements;
-   if (result!=0) {
-   HASH_UNPROTECT_RECURSION(ht1);
-   HASH_UNPROTECT_RECURSION(ht2);
+   if (ht1 == ht2 || result!=0) {
return result;
}

+   HASH_PROTECT_RECURSION(ht1);
+   HASH_PROTECT_RECURSION(ht2);
+
p1 = ht1->pListHead;
if (ordered) {
p2 = ht2->pListHead;



simple test script
---
http://www.php.net/unsub.php



Re: [Fwd: [PHP-DEV] PATCH: zend_alloc.c x86_64 optimization]

2007-12-20 Thread Brian Shire


Looks great, thanks!

-shire

On Dec 20, 2007, at 5:22 AM, Dmitry Stogov wrote:


Hi Brain,

I've committed your patch into all branches, because it not only  
improve performance on x86_64, but also fixes a bug in x86 code.


Thank you very much.

Dmitry.


 Original Message 
Subject: [PHP-DEV] PATCH: zend_alloc.c x86_64 optimization
Date: Wed, 28 Nov 2007 14:27:29 -0800
From: Brian Shire <[EMAIL PROTECTED]>
To: internals@lists.php.net
I noticed that there where some x86_64 assembly optimizations missing
from Zend/zend_alloc.c, it only accounts for i386.  The following
patch should add x86_64 support (I don't have karma for Zend of
course, patch is against 5.2.5), I've included a bench mark with a
simple test script to test out memory allocation.  I'd be interested
in getting feedback on this and knowing what other's results are.
Sorry for the custom micro-bench but zend_bench.php did show small
gains, but they where extremely small to really show anything.
A note on the i386 assembly, I had to add the following '&' char to
make sure the input/outputs are using different registers.  Lack of
this had caused some serious memory corruption on my system with -O2
so I'm guessing it might be appropriate to do the same on the i386
code as well, but I've left that out of this patch, perhaps someone
with some assembler experience can verify this is correct?
-: "=a"(res), "=d" (overflow)
+: "=&a"(res), "=&d" (overflow)
Thanks,
-shire

php-5.2.5
---
no-asmasmdelta
-
Run10.9267299180.767436981-0.159292936
Run20.9202699660.768045902-0.152224064
Run30.9186098580.758006096-0.160603762
Total2.7656097412.293488979-0.472120762
Percentage:-17.07%
Index: Zend/zend_alloc.c
===
--- Zend/zend_alloc.c(revision 69567)
+++ Zend/zend_alloc.c(working copy)
@@ -656,6 +656,11 @@
 __asm__("bsrl %1,%0\n\t" : "=r" (n) : "rm"  (_size));
 return n;
+#elif defined(__GNUC__) && defined(__x86_64__)
+unsigned long n;
+
+__asm__("bsrq %1,%0\n\t" : "=r" (n) : "rm" (_size));
+return (unsigned int)n;
 #elif defined(_MSC_VER) && defined(_M_IX86)
 __asm {
 bsr eax, _size
@@ -677,6 +682,11 @@
 __asm__("bsfl %1,%0\n\t" : "=r" (n) : "rm"  (_size));
 return n;
+#elif defined(__GNUC__) && defined(__x86_64__)
+unsigned long n;
+
+__asm__("bsfq %1,%0\n\t" : "=r" (n) : "rm" (_size));
+  return (unsigned int)n;
 #elif defined(_MSC_VER) && defined(_M_IX86)
 __asm {
 bsf eax, _size
@@ -2309,18 +2319,30 @@
 return _zend_mm_block_size(AG(mm_heap), ptr
ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);
 }
-#if defined(__GNUC__) && defined(i386)
+#if defined(__GNUC__) && (defined(i386) || defined(__x86_64__))
 static inline size_t safe_address(size_t nmemb, size_t size, size_t
offset)
 {
 size_t res = nmemb;
-unsigned long overflow ;
+unsigned long overflow = 0;
-__asm__ ("mull %3\n\taddl %4,%0\n\tadcl $0,%1"
+#if defined(i386)
+__asm__ ("mull %3\n\t"
+"addl %4,%0 \n\t"
+"adcl $0,%1 \n\t"
  : "=a"(res), "=d" (overflow)
  : "%0"(res),
"rm"(size),
"rm"(offset));
+#else /* __x86_64__ */
+__asm__ ("mulq %3 \n\t"
+"addq %4,%0  \n\t"
+"adcq $0,%1  \n\t"
+: "=&a"(res), "=&d" (overflow)
+: "%0"(res),
+"rm"(size),
+"rm"(offset) );
+#endif
 if (UNEXPECTED(overflow)) {
 zend_error_noreturn(E_ERROR, "Possible integer overflow  
in memory

allocation (%zu * %zu + %zu)", nmemb, size, offset);


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] ignored patches

2007-12-03 Thread Brian Shire


On Dec 3, 2007, at 9:16 PM, Gregory Beaver wrote:


Brian Shire wrote:


On Dec 3, 2007, at 2:17 PM, Stanislav Malyshev wrote:


I am a developer on a CMS also which uses the auto-include
functionality to
include many classes over many files. Each request can include  
up to 30

different files.  The speed increase is around the 15% mark when
combining
the files.  This is with APC installed too.


Can you provide some benchmark setups that this could be  
researched -

i.e. describe what was benchmarked and how to reproduce it?



I've seen this come up before internally at Facebook.  Many people  
do a

microtime() test within there code and consider this a definitive
benchmark of how fast there script runs.  Unfortunately this  
excludes a

lot of work that's done prior to execution.  Typically we see people
claiming gains from combining files when in actuality they where just
excluding the compilation time in their benchmark by moving  
compilation
done via includes() to before the initial script begins  
executing.  When
measuring this type of optimization one really must measure  
outside of
PHP using something like an Apache Bench tool so you get an idea  
of the
big picture.  I think trying to optimize these also presumes that  
you're

already running a bytecode cache etc.


Hi Brian and Stas,

I hate to say it, but it is somewhat condescending to assume that the
benchmarks were done with microtime().  I spent about 15 hours of my
time designing a very complex, carefully constructed benchmark, and  
yes,
I ran it with apache benchmark.  In addition, I ran the benchmark  
using

no APC, with APC, and with APC and apc.stat=0.  The benchmark in
question compared require_once to include with full paths to a single
file.  In the best case, I got a 12% performance difference between
include with full paths and apc.stat=0 and a single file.


Hi Greg,

I'm sorry that my message probably did come off as condescending. :- 
(   I really just wanted to share some my *own* pitfalls in case it  
was something that might be helpful here.


If you aren't too put off and you are running APC then perhaps we can  
discuss more off-list as some of our performance problems may be  
similar and I could exchange some optimizations with you to try out.


Again *extremely* sorry for insulting you or anyone else here with my  
not so well thought out email, I'd just like to try to help out a bit.


-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] ignored patches

2007-12-03 Thread Brian Shire


On Dec 3, 2007, at 2:17 PM, Stanislav Malyshev wrote:

I am a developer on a CMS also which uses the auto-include  
functionality to
include many classes over many files. Each request can include up  
to 30
different files.  The speed increase is around the 15% mark when  
combining

the files.  This is with APC installed too.


Can you provide some benchmark setups that this could be researched  
- i.e. describe what was benchmarked and how to reproduce it?




I've seen this come up before internally at Facebook.  Many people do  
a microtime() test within there code and consider this a definitive  
benchmark of how fast there script runs.  Unfortunately this excludes  
a lot of work that's done prior to execution.  Typically we see  
people claiming gains from combining files when in actuality they  
where just excluding the compilation time in their benchmark by  
moving compilation done via includes() to before the initial script  
begins executing.  When measuring this type of optimization one  
really must measure outside of PHP using something like an Apache  
Bench tool so you get an idea of the big picture.  I think trying to  
optimize these also presumes that you're already running a bytecode  
cache etc.


-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] PATCH: zend_alloc.c x86_64 optimization

2007-11-28 Thread Brian Shire


I noticed that there where some x86_64 assembly optimizations missing  
from Zend/zend_alloc.c, it only accounts for i386.  The following  
patch should add x86_64 support (I don't have karma for Zend of  
course, patch is against 5.2.5), I've included a bench mark with a  
simple test script to test out memory allocation.  I'd be interested  
in getting feedback on this and knowing what other's results are.   
Sorry for the custom micro-bench but zend_bench.php did show small  
gains, but they where extremely small to really show anything.


A note on the i386 assembly, I had to add the following '&' char to  
make sure the input/outputs are using different registers.  Lack of  
this had caused some serious memory corruption on my system with -O2  
so I'm guessing it might be appropriate to do the same on the i386  
code as well, but I've left that out of this patch, perhaps someone  
with some assembler experience can verify this is correct?


-   : "=a"(res), "=d" (overflow)
+   : "=&a"(res), "=&d" (overflow)


Thanks,

-shire








php-5.2.5
---
no-asm  asm delta
-
Run10.926729918 0.767436981 -0.159292936
Run20.920269966 0.768045902 -0.152224064
Run30.918609858 0.758006096 -0.160603762
Total   2.765609741 2.293488979 -0.472120762
Percentage: -17.07%




Index: Zend/zend_alloc.c
===
--- Zend/zend_alloc.c   (revision 69567)
+++ Zend/zend_alloc.c   (working copy)
@@ -656,6 +656,11 @@

__asm__("bsrl %1,%0\n\t" : "=r" (n) : "rm"  (_size));
return n;
+#elif defined(__GNUC__) && defined(__x86_64__)
+   unsigned long n;
+
+   __asm__("bsrq %1,%0\n\t" : "=r" (n) : "rm" (_size));
+   return (unsigned int)n;
 #elif defined(_MSC_VER) && defined(_M_IX86)
__asm {
bsr eax, _size
@@ -677,6 +682,11 @@

__asm__("bsfl %1,%0\n\t" : "=r" (n) : "rm"  (_size));
return n;
+#elif defined(__GNUC__) && defined(__x86_64__)
+   unsigned long n;
+
+   __asm__("bsfq %1,%0\n\t" : "=r" (n) : "rm" (_size));
+  return (unsigned int)n;
 #elif defined(_MSC_VER) && defined(_M_IX86)
__asm {
bsf eax, _size
@@ -2309,18 +2319,30 @@
 	return _zend_mm_block_size(AG(mm_heap), ptr  
ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);

 }

-#if defined(__GNUC__) && defined(i386)
+#if defined(__GNUC__) && (defined(i386) || defined(__x86_64__))

 static inline size_t safe_address(size_t nmemb, size_t size, size_t  
offset)

 {
size_t res = nmemb;
-   unsigned long overflow ;
+   unsigned long overflow = 0;

-   __asm__ ("mull %3\n\taddl %4,%0\n\tadcl $0,%1"
+#if defined(i386)
+   __asm__ (   "mull %3\n\t"
+   "addl %4,%0 \n\t"
+   "adcl $0,%1 \n\t"
 : "=a"(res), "=d" (overflow)
 : "%0"(res),
   "rm"(size),
   "rm"(offset));
+#else /* __x86_64__ */
+   __asm__ (   "mulq %3 \n\t"
+   "addq %4,%0  \n\t"
+   "adcq $0,%1  \n\t"
+   : "=&a"(res), "=&d" (overflow)
+   : "%0"(res),
+   "rm"(size),
+   "rm"(offset) );
+#endif

if (UNEXPECTED(overflow)) {
 		zend_error_noreturn(E_ERROR, "Possible integer overflow in memory  
allocation (%zu * %zu + %zu)", nmemb, size, offset);


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Patch for macros for tracking refcount and is_ref

2007-09-08 Thread shire


On Sep 8, 2007, at 9:31 AM, Antony Dovgal wrote:



If these companies really do worry about this %, why don't they  
participate in development and/or testing?
I don't recall seeing any contributions or even feedback from  
Facebook or alike, do you?


If they wait for a release to test and complain, then I couldn't  
care less - it's just a common
development process, bugs come and bugs go, I see no reasons why we  
should make an exception for

a company that doesn't give a damn about us.


I realize that this conversation isn't really about Facebook, but  
while it's been brought up I'd like to step in and say that Facebook  
is *very* interested in helping with PHP development and other open  
source projects.  We are always trying to run the latest CVS or RC  
versions to make sure we don't find any problems or performance  
losses (although we do occasionally fall behind which is unfortunate  
for us).  If there are any new patches that the developers think we  
could help test out for performance and stability I'm very happy to  
do so.


There have been a handful small contributions to APC, PHP, and  
others, and I'm excited that we get to work on these projects and I  
hope myself and others at Facebook will be able to continue to make  
useful contributions.  I'm also grateful for the contributions others  
have made here, and I hope Facebook and myself can establish itself  
as an entity dedicated to giving back as well.  If you have other  
questions or suggestions please feel free to email me, I know I've  
limited my posts here so as not to add anything unnecessary but we  
are here and ready to help.


Thanks,

-shire
 Facebook Inc.
 [EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] zend_mm_set_heap()

2007-04-25 Thread Brian Shire

Thanks Dmitry,

I think what I was going for (in a very poorly worded way) was that  
I'm unable to access any of the heap structure values outside of  
zend_alloc.c because that's where the definition resides.


For example something like the following should generate a compile  
time error outside zend_alloc.c:


zend_mm_heap *heap;
heap->overflow=0;

I've attached a rough patch of what I did to make this accessible  
outside zend_alloc.c.


-shire

On Apr 25, 2007, at 12:01 AM, Dmitry Stogov wrote:


You probably missed.
This function can be used to substitute main PHP heap, so all  
emalloc()

function will work with new one.

Thanks. Dmitry.


-Original Message-
From: Brian Shire [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 25, 2007 1:22 AM
To: PHP internals
Subject: [PHP-DEV] zend_mm_set_heap()



It seems like zend_mm_set_heap() isn't very useful outside
zend_alloc.c because the zend_mm_heap structure definition is within
the zend_alloc.c file rather than zend_alloc.h.  Could we move the AG
() and associated structures to the header file, or perhaps I missed
something?

- Shire
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php








-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

[PHP-DEV] zend_mm_set_heap()

2007-04-24 Thread Brian Shire


It seems like zend_mm_set_heap() isn't very useful outside  
zend_alloc.c because the zend_mm_heap structure definition is within  
the zend_alloc.c file rather than zend_alloc.h.  Could we move the AG 
() and associated structures to the header file, or perhaps I missed  
something?


- Shire
[EMAIL PROTECTED]
[EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Trapping "memory exhausted" error

2007-04-17 Thread Brian Shire


I have code at Facebook that currently does something similar by  
catching fatal errors but there's a lot of gotcha's obviously that  
need to be handled properly when the engine is in this state.  When  
possible we enable the printing of backtraces and other debug  
information, however you run the risk that instead of a nice  
backtrace you get a nice "Segmentation fault" error instead.  This  
can be wrapped up in an extension and is all C code functionality,  
not exactly a soft memory limit so it's a little different than  
what's being discussed, but it might be another option to what you're  
looking for.  I'm happy to share with a few initially who would find  
it useful/give feedback.


-shire

On Apr 16, 2007, at 3:19 AM, David Sklar wrote:


I am interested in being able to trap the (currently) fatal error that
results when memory usage exceeds the defined memory limit.

I was thinking it could work as follows:

- in addition to a memory_limit configuration directive, there could
be a "memory_limit_grace" configuration directive. This gets stored in
the struct _zend_mm_heap, along with the limit.

- Also added to struct _zend_mm_heap is a "initial limit reached" flag

- When zend_mm_safe_error() in Zend/zend_alloc.c is invoked under the
current conditions (memory_limit exceeded), it sets the "initial limit
reached" flag, adjusts the heap limit to the "memory_limit_grace"
value and throws some non-fatal error (or an exception if it is
feasible from here.)

- When zend_mm_safe_error() is invoked and the "initial limit reached"
flag is already set, it throws the fatal error exactly as it does now.

This "grace period" would provide a way to gracefully exit when a
memory limit is reached, but also has the hard limit still enforced so
that code which is supposed to be gracefully exiting doesn't chew up
too much additional memory.

Is it feasible to adjust the heap limit and throw a non-fatal error
from within zend_mm_safe_error()?

David

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] [PATCH] segfault with string dimensions

2007-03-07 Thread Brian Shire
mp;& !0) {
	(*object_ptr)->refcount++;  /* undo the effect of  
get_obj_zval_ptr_ptr() */

}
-   if (Z_TYPE_PP(object_ptr) == IS_OBJECT) {
+   if (object_ptr && Z_TYPE_PP(object_ptr) == 
IS_OBJECT) {
	return zend_binary_assign_op_obj_helper_SPEC_UNUSED_CV 
(binary_op, ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);

} else {
zend_op *op_data = opline+1;
@@ -20495,11 +20495,11 @@
case ZEND_ASSIGN_DIM: {
zval **object_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts),  
BP_VAR_W TSRMLS_CC);

-   if (IS_CV != IS_CV && !0) {
+   if (object_ptr && IS_CV != IS_CV && !0) {
	(*object_ptr)->refcount++;  /* undo the effect of  
get_obj_zval_ptr_ptr() */

}
-   if (Z_TYPE_PP(object_ptr) == IS_OBJECT) {
+   if (object_ptr && Z_TYPE_PP(object_ptr) == 
IS_OBJECT) {
	return zend_binary_assign_op_obj_helper_SPEC_CV_CONST(binary_op,  
ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);

} else {
zend_op *op_data = opline+1;
@@ -21971,11 +21971,11 @@
case ZEND_ASSIGN_DIM: {
zval **object_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts),  
BP_VAR_W TSRMLS_CC);

-   if (IS_CV != IS_CV && !0) {
+   if (object_ptr && IS_CV != IS_CV && !0) {
	(*object_ptr)->refcount++;  /* undo the effect of  
get_obj_zval_ptr_ptr() */

}
-   if (Z_TYPE_PP(object_ptr) == IS_OBJECT) {
+   if (object_ptr && Z_TYPE_PP(object_ptr) == 
IS_OBJECT) {
	return zend_binary_assign_op_obj_helper_SPEC_CV_TMP(binary_op,  
ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);

} else {
zend_op *op_data = opline+1;
@@ -23451,11 +23451,11 @@
case ZEND_ASSIGN_DIM: {
zval **object_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts),  
BP_VAR_W TSRMLS_CC);

-   if (IS_CV != IS_CV && !0) {
+   if (object_ptr && IS_CV != IS_CV && !0) {
	(*object_ptr)->refcount++;  /* undo the effect of  
get_obj_zval_ptr_ptr() */

}
-   if (Z_TYPE_PP(object_ptr) == IS_OBJECT) {
+   if (object_ptr && Z_TYPE_PP(object_ptr) == 
IS_OBJECT) {
	return zend_binary_assign_op_obj_helper_SPEC_CV_VAR(binary_op,  
ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);

} else {
zend_op *op_data = opline+1;
@@ -24734,11 +24734,11 @@
case ZEND_ASSIGN_DIM: {
zval **object_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts),  
BP_VAR_W TSRMLS_CC);

-   if (IS_CV != IS_CV && !0) {
+   if (object_ptr && IS_CV != IS_CV && !0) {
	(*object_ptr)->refcount++;  /* undo the effect of  
get_obj_zval_ptr_ptr() */

}
-   if (Z_TYPE_PP(object_ptr) == IS_OBJECT) {
+   if (object_ptr && Z_TYPE_PP(object_ptr) == 
IS_OBJECT) {
	return zend_binary_assign_op_obj_helper_SPEC_CV_UNUSED 
(binary_op, ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);

} else {
zend_op *op_data = opline+1;
@@ -25402,11 +25402,11 @@
case ZEND_ASSIGN_DIM: {
zval **object_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts),  
BP_VAR_W TSRMLS_CC);

-       if (IS_CV != IS_CV && !0) {
+   if (object_ptr && IS_CV != IS_CV && !0) {
	(*object_ptr)->refcount++;  /* undo the effect of  
get_obj_zval_ptr_ptr() */

}
-   if (Z_TYPE_PP(object_ptr) == IS_OBJECT) {
+   if (object_ptr && Z_TYPE_PP(object_ptr) == 
IS_OBJECT) {
	return zend_binary_assign_op_obj_helper_SPEC_CV_CV(binary_op,  
ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);

} else {
zend_op *op_data = opline+1;





- Shire
[EMAIL PROTECTED]
[EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] extract() optimization

2006-12-09 Thread Brian Shire


This patch should safely remove what appears to be unnecessary strlen 
() calls for php_valid_var_name() used by extract().


diff --git a/ext/standard/array.c b/ext/standard/array.c
index fad4bf2..401957f 100644
--- a/ext/standard/array.c
+++ b/ext/standard/array.c
@@ -1270,15 +1270,13 @@ PHP_FUNCTION(array_search)
/* }}} */
-static int php_valid_var_name(char *var_name)
+static int php_valid_var_name(char *var_name, uint len)
{
-   int len, i;
+   int i;

if (!var_name)
return 0;

-   len = strlen(var_name);
-
if (!isalpha((int)((unsigned char *)var_name)[0]) && var_name 
[0] != '_')

return 0;

@@ -1409,7 +1407,7 @@ PHP_FUNCTION(extract)
case EXTR_PREFIX_INVALID:
if (final_name.len == 0) {
-   if (!php_valid_var_name 
(var_name)) {
+   if (!php_valid_var_name 
(var_name, var_name_len)) {
smart_str_appendl 
(&final_name, Z_STRVAL_PP(prefix), Z_STRLEN_PP(prefix));
smart_str_appendc 
(&final_name, '_');
smart_str_appendl 
(&final_name, var_name, var_name_len);

@@ -1426,7 +1424,7 @@ PHP_FUNCTION(extract)
if (final_name.len) {
smart_str_0(&final_name);
-   if (php_valid_var_name(final_name.c)) {
+   if (php_valid_var_name(final_name.c,  
final_name.len)) {

if (extract_refs) {
    zval **orig_var;




-Brian Shire
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 aim: int80h

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] make test flag?

2006-12-09 Thread Brian Shire


Hello Marcus,

On Dec 4, 2006, at 11:37 AM, Marcus Boerger wrote:


Hello Luca,

  we cannot do that. This patchwould prevent loading of any shared
extension. And we decided against using dl() in the test scripts  
some years
ago. The only thing we can do here is having a new make thing that  
does pass

-n to the test script.


Does this mean a patch should be re-worked so that there's a second  
type of 'make test' that includes the -n option, that we want to  
include the -n in the default 'make test', or that we just need a  
different solution completely (or none at all)?  Another more  
complicated solution that I thought of but didn't explore is to  
provide the ability to ignore duplicate extensions for the make  
tests.  Although this may complicate the php cli for something that  
is a pretty specific scenario.



Once again, we test what you will be using, not only
part of it. If a deployment system hass a problemwith that we need  
tofind
otehr solutions. In fact we already have anenvironment variable  
today that
serves the purpose. Simply set TEST_PHP_ARGS=-n and be done (if you  
know

what it really does) :-)


Didn't realize this existed, would it be preferred to try and have a  
DSO extensions set TEST_PHP_ARGS to avoid this situation?


Thanks,


-Brian Shire
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 aim: int80h

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] INI Includes

2006-12-06 Thread Brian Shire


I just read the thread regarding support for INI Includes at: http:// 
marc2.theaimsgroup.com/?t=10329000458&r=1&w=2


I had made another patch for this before I saw the post (A little bit  
of a different method though, via the php callback).  Having this  
support helps us simplify larger configurations especially when using  
a configuration management tool.  Just wondering if this is a  
confirmed "not going to happen" item or if it's still something that  
may be of interest.


Thanks,

-Brian Shire
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 aim: int80h

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] make test flag?

2006-12-01 Thread Brian Shire


On Dec 1, 2006, at 3:21 PM, Marcus Boerger wrote:


Hello Brian,

  why should it do that? run-tests.php doesn't do anything with  
extensions.

  Meaning it only tests when you load your .so's by ini.




  $(PHP_EXECUTABLE) -d 'open_basedir=' -d 'safe_mode=0' -d  
'output_buffering=0' -d 'memory_limit=-1' $(top_srcdir)/run-tests.php  
-d 'extension_dir=modules/' -d `( . $(PHP_MODULES) ; echo extension=$ 
$dlname)` tests/; \


Given this only affects a make test being done in an extension that  
has been setup with a "phpize" command.  Let me know if I've  
overlooked anything.  I believe the above line that executes the  
"make test" causes two minor problems for ini files loading DSO's.   
First it sets the "extension_dir" to modules. Any attempt to load  
modules from installed ini files will fail because the path is now  
different.  The second is that the last portion loads the extension  
by doing a "-d extension=" on the php commandline, which I  
causes my "already loaded error".





best regards
marcus

Saturday, December 2, 2006, 12:16:45 AM, you wrote:


On Dec 1, 2006, at 12:42 AM, Marcus Boerger wrote:



Hello Brian,

  therun-tests.php command will take care of any INI settings that
are known
to break tests.So if anyof your INI settings cause tests to failwe
simply
need to include them into the tests. Othewise it is important to
test with
your INI settings asyou want to know if your PHP installation
works. Instead
of knowing whether PHP in general could work.




Thanks Marcus,



The actual situation I'm thinking of is when you are running make
test on an extension being built as a DSO that is already configured
to load in an ini file.  When make test is ran it will attempt to
load the DSO twice, and gives: "PHP Warning:  Module 'apc' already
loaded in Unknown on line 0".



I thought the preference would be to ignore installed ini files, but
if that's not the intention, then my proposed patch of course is bad.




-Brian Shire
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  aim: int80h





Best regards,
 Marcus



-Brian Shire
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 aim: int80h

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] make test flag?

2006-12-01 Thread Brian Shire


On Dec 1, 2006, at 12:42 AM, Marcus Boerger wrote:


Hello Brian,

  therun-tests.php command will take care of any INI settings that  
are known
to break tests.So if anyof your INI settings cause tests to failwe  
simply
need to include them into the tests. Othewise it is important to  
test with
your INI settings asyou want to know if your PHP installation  
works. Instead

of knowing whether PHP in general could work.



Thanks Marcus,

The actual situation I'm thinking of is when you are running make  
test on an extension being built as a DSO that is already configured  
to load in an ini file.  When make test is ran it will attempt to  
load the DSO twice, and gives: "PHP Warning:  Module 'apc' already  
loaded in Unknown on line 0".


I thought the preference would be to ignore installed ini files, but  
if that's not the intention, then my proposed patch of course is bad.



-Brian Shire
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 aim: int80h

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] make test flag?

2006-11-30 Thread Brian Shire


I noticed the make test command may need the "-n" flag added so that  
it will exclude already installed ini files.  Specifically this  
causes a problem when running make test on a system that already has  
the same DSO installed so it tries to load it twice (once via the  
installed ini and again for the test).   Perhaps there's other places  
where this causes problems? ...



"  -n   No php.ini file will be used"


diff --git a/Makefile.global b/Makefile.global
index ec86927..237c1a6 100644
--- a/Makefile.global
+++ b/Makefile.global
@@ -74,7 +74,7 @@ test: all
TEST_PHP_EXECUTABLE=$(PHP_EXECUTABLE) \
TEST_PHP_SRCDIR=$(top_srcdir) \
CC="$(CC)" \
-			$(PHP_EXECUTABLE) -d 'open_basedir=' -d 'safe_mode=0' -d  
'output_buffering=0' -d 'memory_limit=-1' $(top_srcdir)/run-tests.php  
-d 'extension_dir=modules/' -d `( . $(PHP_MODULES) ; echo extension=$ 
$dlname)` tests/; \
+			$(PHP_EXECUTABLE) -d 'open_basedir=' -d 'safe_mode=0' -d  
'output_buffering=0' -d 'memory_limit=-1' $(top_srcdir)/run-tests.php  
-n -d 'extension_dir=modules/' -d `( . $(PHP_MODULES) ; echo  
extension=$$dlname)` tests/; \

elif test ! -z "$(SAPI_CLI_PATH)" && test -x "$(SAPI_CLI_PATH)"; then \
TEST_PHP_EXECUTABLE=$(top_builddir)/$(SAPI_CLI_PATH) \
TEST_PHP_SRCDIR=$(top_srcdir) \




-Brian Shire
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 aim: int80h

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Anyone know why this isn't safe?

2006-09-28 Thread Brian Shire
Ran into a problem with the following APC code change.  I'm not  
seeing the error message on duplicate function definitions in  
different files when compiled/cached individually then both included  
in another file.  ie: file a.php and file b.php both contain the  
function foo().  If I request and cache in APC file a.php and b.php  
individually but then include them both in c.php, I won't get a  
duplicate function error (as it's commented out in APC).  Can someone  
help fill me in on why this isn't always valid to have this error here?


Commit message was
"This check isn't always valid here"
http://cvs.php.net/viewvc.cgi/pecl/apc/apc_main.c?r1=3.65&r2=3.66

--- apc_main.c  2005/12/15 08:20:07 3.65
+++ apc_main.c  2005/12/15 17:22:04 3.66
@@ -28,7 +28,7 @@

  */

-/* $Id: apc_main.c,v 3.65 2005/12/15 08:20:07 rasmus Exp $ */
+/* $Id: apc_main.c,v 3.66 2005/12/15 17:22:04 rasmus Exp $ */

 #include "apc_php.h"
 #include "apc_main.h"
@@ -79,7 +79,7 @@
   NULL);

 if (status == FAILURE) {
-zend_error(E_ERROR, "Cannot redeclare %s()", fn.name);
+/* zend_error(E_ERROR, "Cannot redeclare %s()", fn.name); */
 }

 return status;


-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] CVS Account Request: shire

2006-03-22 Thread Brian Shire
Access to pecl/apc to committ bug fixes/modifications directly.  Recieved 
invitation from Rasums.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php