From:
Operating system:
PHP version:      Irrelevant
Package:          Unknown/Other Function
Bug Type:         Feature/Change Request
Bug description:  Tokenize multiple strings in parallel is impossible using strtok

Description:
------------
Hi there,

I have some very long strings (they come from a MySQL query selecting multiple GROUP_CONCAT-enated fields) that I would like to tokenize quickly using the strtok function. However, I cannot tokenize them in parallel as I need to, that is, fetch the first token from every string, then the second token from every string, and so on.

E.g. given:

    $a = '1,2,3,4';
    $b = 'a,b,c,d';

strtok($a, ',') returns the first token of $a. If I then call strtok($b, ','), it returns the first token of $b, so I have '1' and 'a' together for the first iteration. In the next iteration I would need '2' and 'b', but strtok(',') will only give me 'b'. There is no way to fetch '2' from $a, since strtok can only work on one string at a time.

I know I could implement my own algorithm that accesses the strings character by character, but that would be slow. I have tried explode(',', $a) and explode(',', $b), but that uses too much memory: it generates very big intermediate arrays, and my strings are very long and numerous.

Would it be possible to have a tokenizer facility that does not store the string to tokenize, but only the index of the last processed character, so that it can be used on multiple strings? I know that maybe I shouldn't use PHP for such tasks and should code in C instead, but PHP is so quick to develop with that I cannot resist :)

I understand that strtok has been engineered this way to avoid passing the string over and over, which would be slow for long strings due to pass by value. Therefore I would like to have another tokenizer function that accepts a reference to a string and always takes two arguments.
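
A minimal userland sketch of the kind of tokenizer requested above, offered only as an illustration. The function name next_token and its third parameter are hypothetical and not part of PHP. Unlike the two-argument form asked for, this sketch has the caller keep one integer offset per string and pass it by reference. It relies on the built-in strspn, strcspn and substr, so no character-by-character loop runs in PHP code and no intermediate array is built. Like strtok, it skips empty fields, so columns with empty values would fall out of step.

```php
<?php
// Hypothetical helper (not a PHP built-in): a tokenizer with no hidden state.
// The caller owns $offset, so any number of strings can be walked in step.
function next_token($subject, $delimiters, &$offset)
{
    $length = strlen($subject);
    if ($offset >= $length) {
        return false; // nothing left to read
    }

    // Skip any run of delimiters before the token, as strtok does.
    $offset += strspn($subject, $delimiters, $offset);
    if ($offset >= $length) {
        return false; // only delimiters remained
    }

    // Measure the token up to the next delimiter and copy just that part.
    $tokenLength = strcspn($subject, $delimiters, $offset);
    $token = substr($subject, $offset, $tokenLength);
    $offset += $tokenLength;

    return $token;
}

// The example from the report: read both strings one token at a time.
$a = '1,2,3,4';
$b = 'a,b,c,d';
$posA = 0;
$posB = 0;
$pairs = [];

while (($x = next_token($a, ',', $posA)) !== false
    && ($y = next_token($b, ',', $posB)) !== false) {
    $pairs[] = [$x, $y];
}
```

Each iteration of the loop yields one token from $a and the matching token from $b. The only per-string state is the integer offset, and the only copies made are the tokens themselves.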

-- 
Edit bug report at https://bugs.php.net/bug.php?id=55261&edit=1