Hi

On 2/16/22 12:39, Go Kudo wrote:
  Is the nextByteSize() method actually required? PHP strings already know
their own length.

This is a convenience of the current implementation.

You already said that you will think of some good ideas, but I'd like to be clear that the convenience of the internal implementation should not be something that affects the user-facing implementation.

In fact with the current implementation I can trivially create a memory-unsafety bug:

    <?php

    use Random\Engine;
    use Random\Randomizer;

    final class Bug implements Engine {
        public function generate(): string
        {
            return '';
        }

        public function nextByteSize(): int {
            return 7;
        }
    }

    $e = new Bug();
    $g = new Randomizer($e);

    var_dump(\bin2hex($g->getBytes(8)));

Results in:

    ==116755== Use of uninitialised value of size 8
    ==116755==    at 0x6180C8: php_bin2hex (string.c:111)
    ==116755==    by 0x6185D2: zif_bin2hex (string.c:220)
    ==116755==    by 0x79BDB4: ZEND_DO_ICALL_SPEC_RETVAL_USED_HANDLER 
(zend_vm_execute.h:1312)
    ==116755==    by 0x8194F0: execute_ex (zend_vm_execute.h:55503)
    ==116755==    by 0x81ED86: zend_execute (zend_vm_execute.h:59858)
    ==116755==    by 0x75A923: zend_execute_scripts (zend.c:1744)
    ==116755==    by 0x69C8C4: php_execute_script (main.c:2535)
    ==116755==    by 0x8E0B19: do_cli (php_cli.c:965)
    ==116755==    by 0x8E1DF9: main (php_cli.c:1367)
==116755== ==116755== Use of uninitialised value of size 8
    ==116755==    at 0x618100: php_bin2hex (string.c:112)
    ==116755==    by 0x6185D2: zif_bin2hex (string.c:220)
    ==116755==    by 0x79BDB4: ZEND_DO_ICALL_SPEC_RETVAL_USED_HANDLER 
(zend_vm_execute.h:1312)
    ==116755==    by 0x8194F0: execute_ex (zend_vm_execute.h:55503)
    ==116755==    by 0x81ED86: zend_execute (zend_vm_execute.h:59858)
    ==116755==    by 0x75A923: zend_execute_scripts (zend.c:1744)
    ==116755==    by 0x69C8C4: php_execute_script (main.c:2535)
    ==116755==    by 0x8E0B19: do_cli (php_cli.c:965)
    ==116755==    by 0x8E1DF9: main (php_cli.c:1367)
==116755== string(16) "0000000000000000"

For userland implementations you really must derive the size from the returned bytestring. Otherwise it's easy for a developer to create an implementation where nextByteSize() and generate() disagree. Even if the memory-unsafety is fixed, this will result in frustration for the user.

For native implementations you can keep some explicit width in the code, if that helps with performance.

Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php

Reply via email to