ID: 33093
Updated by: [EMAIL PROTECTED]
Reported By: [EMAIL PROTECTED]
-Status: Open
+Status: Feedback
Bug Type: Unknown/Other Function
Operating System: Mac OS X 10.4.1
PHP Version: 5.0.4
New Comment:
Where's the missing data?
php -r 'var_dump(token_get_all("<?php echo \$var ?>"));'
array(6) {
[0]=>
array(2) {
[0]=>
int(366)
[1]=>
string(6) "<?php "
}
[1]=>
array(2) {
[0]=>
int(316)
[1]=>
string(4) "echo"
}
[2]=>
array(2) {
[0]=>
int(369)
[1]=>
string(1) " "
}
[3]=>
array(2) {
[0]=>
int(309)
[1]=>
string(4) "$var"
}
[4]=>
array(2) {
[0]=>
int(369)
[1]=>
string(1) " "
}
[5]=>
array(2) {
[0]=>
int(368)
[1]=>
string(2) "?>"
}
}
php -r 'var_dump(token_get_all("<?php \necho \$var\n?>"));'
array(7) {
[0]=>
array(2) {
[0]=>
int(366)
[1]=>
string(6) "<?php "
}
[1]=>
array(2) {
[0]=>
int(369)
[1]=>
string(1) "
"
}
[2]=>
array(2) {
[0]=>
int(316)
[1]=>
string(4) "echo"
}
[3]=>
array(2) {
[0]=>
int(369)
[1]=>
string(1) " "
}
[4]=>
array(2) {
[0]=>
int(309)
[1]=>
string(4) "$var"
}
[5]=>
array(2) {
[0]=>
int(369)
[1]=>
string(1) "
"
}
[6]=>
array(2) {
[0]=>
int(368)
[1]=>
string(2) "?>"
}
}
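A quick way to check whether anything is actually missing is to join the
token texts back together and compare the result with the input. The
sketch below assumes the tokenizer extension is loaded; tokens_to_source()
is a made-up helper name, not part of PHP. It also prints token_name() for
each token, since the numeric IDs shown above can differ between PHP
versions.

```php
<?php
// Hypothetical helper (not a PHP built-in): concatenate the text of every token.
function tokens_to_source($tokens)
{
    $out = '';
    foreach ($tokens as $token) {
        // A token is either an array (id, text, ...) or a one-character string.
        $out .= is_array($token) ? $token[1] : $token;
    }
    return $out;
}

$source = "<?php echo \$var ?>";
$tokens = token_get_all($source);

foreach ($tokens as $token) {
    if (is_array($token)) {
        // token_name() maps the numeric ID to its symbolic name.
        printf("%-14s %s\n", token_name($token[0]), var_export($token[1], true));
    } else {
        printf("%-14s %s\n", '(char)', var_export($token, true));
    }
}

// If no data were lost, the joined token texts equal the input.
var_dump(tokens_to_source($tokens) === $source);
```

Note that the text of the first token in the output above is "<?php " (six
characters), so the space after the open tag is carried by T_OPEN_TAG
rather than by a separate T_WHITESPACE token.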
Previous Comments:
------------------------------------------------------------------------
[2005-05-21 18:40:38] [EMAIL PROTECTED]
Description:
------------
It appears that token_get_all() does not report T_OPEN_TAG and
T_WHITESPACE properly, depending on the whitespace following the
opening tag. For example, when parsing ...
<?php echo $var ?>
... you get T_OPEN_TAG, T_ECHO, T_WHITESPACE, T_VARIABLE, T_WHITESPACE, and
T_CLOSE_TAG. This is not entirely the expected result (I would expect
T_WHITESPACE between the open tag and the echo).
However, when parsing the functional equivalent...
<?php
echo $var
?>
you get "<", "?", T_STRING ("php"), T_WHITESPACE, T_ECHO, T_WHITESPACE,
T_VARIABLE, T_WHITESPACE, and T_CLOSE_TAG. In addition, the first whitespace
value reported does not include all the newlines (it drops one).
Although classic Mac OS used \r for newlines (Mac OS X uses \n), the test
code uses the Unix-standard \n, so I don't think it's Mac-related.
If this is in fact a bug, the current behavior makes it difficult to
write a reliable userland code auditor and report proper line numbers.
Am I missing some assumptions behind the behavior of the tokenizer
function?
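For the line-number use case mentioned above, one possible approach is to
count the newlines in each token's text while walking the token list. This
is only a sketch: tokens_with_lines() is a hypothetical helper name, and it
assumes the token texts, taken together, cover the whole source, so that a
newline inside the open tag's text is counted as well.

```php
<?php
// Hypothetical helper (not a PHP built-in): pair each token with its starting line.
function tokens_with_lines($source)
{
    $line = 1;
    $result = array();
    foreach (token_get_all($source) as $token) {
        // A token is either an array (id, text, ...) or a one-character string.
        $text = is_array($token) ? $token[1] : $token;
        $result[] = array('line' => $line, 'text' => $text);
        // Advance by however many newlines this token's text contains.
        $line += substr_count($text, "\n");
    }
    return $result;
}

foreach (tokens_with_lines("<?php\necho \$var\n?>") as $entry) {
    printf("line %d: %s\n", $entry['line'], var_export($entry['text'], true));
}
```

Because the count is taken from the token text itself, it does not matter
whether a given newline is reported as part of the open tag or as
T_WHITESPACE.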
------------------------------------------------------------------------