#47815 [Asn]: Compile time computation of hash value decrease hash lookup time.

2009-03-30 Thread dmitry
 ID:   47815
 Updated by:   dmi...@php.net
 Reported By:  basant dot kukreja at gmail dot com
 Status:   Assigned
 Bug Type: Performance problem
 Operating System: Solaris 10
 PHP Version:  5.2.9
 Assigned To:  dmitry
 New Comment:

I like the idea, but I don't like some implementation details (addition
field in zend_opcode, tricky zend_std_read_property_hash() invocation).
However, I know that it's probably not possible to do it better.

I would like to make some benchmark tests, but the patch is broken.
Could you please put the patch somewhere or send it by email.


Previous Comments:


[2009-03-27 23:02:12] basant dot kukreja at gmail dot com

Some signifiant percentage of the time is spent in calculating the hash
value
of string contants.

If we compute the hash value of string constants during compilation
then
lookup time can be improved a lot.

With the above submitted patch results are better : 
Excl. Incl. Name  
User CPU  User CPU   
sec.  sec. 
 414.450   726.638 zend_fetch_dimension_address
 74.922238.016  zend_get_property_info_hval

Note the 150 second (~20 % time) less time spent in
zend_fetch_dimension_address and 190 second (45% time) reduction in
zend_get_property_info.

It showed 1% performance overall.



[2009-03-27 22:59:55] basant dot kukreja at gmail dot com

diff -r 00438f7eebe4 php-5.2.9RC3/Zend/zend_compile.h
--- a/php-5.2.9RC3/Zend/zend_compile.h  Tue Mar 17 11:27:02 2009 -0700
+++ b/php-5.2.9RC3/Zend/zend_compile.h  Fri Mar 27 10:18:13 2009 -0700
@@ -83,6 +83,7 @@
znode op1;
znode op2;
ulong extended_value;
+   ulong hval;
uint lineno;
zend_uchar opcode;
 };
diff -r 00438f7eebe4 php-5.2.9RC3/Zend/zend_execute.c
--- a/php-5.2.9RC3/Zend/zend_execute.c  Tue Mar 17 11:27:02 2009 -0700
+++ b/php-5.2.9RC3/Zend/zend_execute.c  Fri Mar 27 10:18:13 2009 -0700
@@ -930,11 +930,12 @@
return NULL;
 }
 
-static inline zval **zend_fetch_dimension_address_inner(HashTable *ht,
zval *dim, int type TSRMLS_DC)
+static inline zval **zend_fetch_dimension_address_inner(HashTable *ht,
zval *dim, int type, zend_ulong hval, int usehval TSRMLS_DC)
 {
zval **retval;
char *offset_key;
int offset_key_length;
+   int ret;
 
switch (dim-type) {
case IS_NULL:
@@ -948,7 +949,13 @@
offset_key_length = dim-value.str.len;

 fetch_string_dim:
-   if (zend_symtable_find(ht, offset_key, 
offset_key_length+1, (void
**) retval) == FAILURE) {
+   if (usehval) {
+   ret = zend_symtable_quick_find(ht, offset_key,
offset_key_length+1, hval, (void **) retval);
+   }
+   else {
+   ret = zend_symtable_find(ht, offset_key, 
offset_key_length+1,
(void **) retval);
+   }
+   if (ret == FAILURE) {
switch (type) {
case BP_VAR_R:
zend_error(E_NOTICE, Undefined 
index:  %s, offset_key);
@@ -1023,7 +1030,7 @@
return retval;
 }
 
-static void zend_fetch_dimension_address(temp_variable *result, zval
**container_ptr, zval *dim, int dim_is_tmp_var, int type TSRMLS_DC)
+static void zend_fetch_dimension_address(temp_variable *result, zval
**container_ptr, zval *dim, int dim_is_tmp_var, int type, zend_ulong
hval, int usehval TSRMLS_DC)
 {
zval *container;
 
@@ -1078,7 +1085,7 @@
new_zval-refcount--;
}
} else {
-   retval = 
zend_fetch_dimension_address_inner(Z_ARRVAL_P(container),
dim, type TSRMLS_CC);
+   retval = 
zend_fetch_dimension_address_inner(Z_ARRVAL_P(container),
dim, type, hval, usehval TSRMLS_CC);
}
if (result) {
result-var.ptr_ptr = retval;
diff -r 00438f7eebe4 php-5.2.9RC3/Zend/zend_hash.h
--- a/php-5.2.9RC3/Zend/zend_hash.h Tue Mar 17 11:27:02 2009 -0700
+++ b/php-5.2.9RC3/Zend/zend_hash.h Fri Mar 27 10:18:13 2009 -0700
@@ -354,6 +354,12 @@
return zend_hash_find(ht, arKey, nKeyLength, pData);
 }
 
+static inline int zend_symtable_quick_find(HashTable *ht, char *arKey,
uint nKeyLength, ulong h, void **pData)
+{
+   HANDLE_NUMERIC(arKey, nKeyLength, zend_hash_index_find(ht, idx,
pData));
+   return zend_hash_quick_find(ht, arKey, nKeyLength, h, pData);
+}
+
 
 static inline int zend_symtable_exists(HashTable *ht, char *arKey,
uint nKeyLength)
 {
diff -r 00438f7eebe4 php-5.2.9RC3/Zend/zend_object_handlers.c

#47815 [Asn]: Compile time computation of hash value decrease hash lookup time.

2009-03-30 Thread basant dot kukreja at gmail dot com
 ID:   47815
 User updated by:  basant dot kukreja at gmail dot com
 Reported By:  basant dot kukreja at gmail dot com
 Status:   Assigned
 Bug Type: Performance problem
 Operating System: Solaris 10
 PHP Version:  5.2.9
 Assigned To:  dmitry
 New Comment:

Regarding adding new field in opcode :
   Ideally I think it would be best if we could have used
extended_value for
the hash value but some of the relevant opcodes were already using this
field.
I understand that adding one field in opcode can cause reduction in
performance in some scenarios.

Other possible solutions which come to my mind is (which is kind of
hack):
* Encode hash value inside string e.g
zval.value.str.val = abc\0hash_value
zval.value.str.len = 3

-
Regarding using zend_std_read_property_hval :
   Ideally I would like to add a parameter in zend_std_read_property
but it
would break other object handlers. One possibility is to add a
function
pointer in _zend_object_handlers e.g 


typedef zval *(*zend_object_read_property_hash_t)(zval *object, zval
*member, hlong hval, int type TSRMLS_DC);

struct _zend_object_handlers {
...

zend_object_read_property_hash_t
read_property_hash;
};

And then in opcode handler, we can write :

if ((OP2_TYPE == IS_CONST) 
Z_OBJ_HT_P(container)-read_property_hash)
{
*retval = 
Z_OBJ_HT_P(container)-read_property_hash(container,
offset,

  EX(opline)-hval,

  type TSRMLS_CC);
} else {
*retval = 
Z_OBJ_HT_P(container)-read_property(container, offset,
type TSRMLS_CC);
}


Previous Comments:


[2009-03-30 12:48:04] dmi...@php.net

I like the idea, but I don't like some implementation details (addition
field in zend_opcode, tricky zend_std_read_property_hash() invocation).
However, I know that it's probably not possible to do it better.

I would like to make some benchmark tests, but the patch is broken.
Could you please put the patch somewhere or send it by email.



[2009-03-27 23:02:12] basant dot kukreja at gmail dot com

Some signifiant percentage of the time is spent in calculating the hash
value
of string contants.

If we compute the hash value of string constants during compilation
then
lookup time can be improved a lot.

With the above submitted patch results are better : 
Excl. Incl. Name  
User CPU  User CPU   
sec.  sec. 
 414.450   726.638 zend_fetch_dimension_address
 74.922238.016  zend_get_property_info_hval

Note the 150 second (~20 % time) less time spent in
zend_fetch_dimension_address and 190 second (45% time) reduction in
zend_get_property_info.

It showed 1% performance overall.



[2009-03-27 22:59:55] basant dot kukreja at gmail dot com

diff -r 00438f7eebe4 php-5.2.9RC3/Zend/zend_compile.h
--- a/php-5.2.9RC3/Zend/zend_compile.h  Tue Mar 17 11:27:02 2009 -0700
+++ b/php-5.2.9RC3/Zend/zend_compile.h  Fri Mar 27 10:18:13 2009 -0700
@@ -83,6 +83,7 @@
znode op1;
znode op2;
ulong extended_value;
+   ulong hval;
uint lineno;
zend_uchar opcode;
 };
diff -r 00438f7eebe4 php-5.2.9RC3/Zend/zend_execute.c
--- a/php-5.2.9RC3/Zend/zend_execute.c  Tue Mar 17 11:27:02 2009 -0700
+++ b/php-5.2.9RC3/Zend/zend_execute.c  Fri Mar 27 10:18:13 2009 -0700
@@ -930,11 +930,12 @@
return NULL;
 }
 
-static inline zval **zend_fetch_dimension_address_inner(HashTable *ht,
zval *dim, int type TSRMLS_DC)
+static inline zval **zend_fetch_dimension_address_inner(HashTable *ht,
zval *dim, int type, zend_ulong hval, int usehval TSRMLS_DC)
 {
zval **retval;
char *offset_key;
int offset_key_length;
+   int ret;
 
switch (dim-type) {
case IS_NULL:
@@ -948,7 +949,13 @@
offset_key_length = dim-value.str.len;

 fetch_string_dim:
-   if (zend_symtable_find(ht, offset_key, 
offset_key_length+1, (void
**) retval) == FAILURE) {
+   if (usehval) {
+   ret = zend_symtable_quick_find(ht, offset_key,
offset_key_length+1, hval, (void **) retval);
+   }
+   else {
+   ret = zend_symtable_find(ht, offset_key, 
offset_key_length+1,
(void **) retval);
+   }
+   if (ret == FAILURE) {
switch (type) {