Hi Brian,

----- Original Message -----
From: "shire"
Sent: Monday, March 09, 2009

Hey Matt,

Matt Wilmas wrote:

9. tokenizer misses last single-line comment
(http://bugs.php.net/bug.php?id=46817)

I was going to take care of that one, as I mentioned in a previous
message, though it's been a while, as I've been delayed much longer
with stuff here. :-( (Nothing set up for building PHP on this system
yet; hope to in the next several hours finally, and do some things!)


Sorry I missed your earlier email.  I saw this sitting on the 5.3 todo
list and it was breaking some of our parsing so I figured I'd take a stab
at it.  Here is my current patch
http://tekrat.com/downloads/bits/php53.scanner_eof.patch, please let me
know if you have some suggestions/changes.  It sounds like you commented
on this initially so please let me know what you/we should do, i.e. merging
my patch/your work, committing this, or if you had a better fix in mind
etc. My biggest complaint is that my current patch requires adding \x00
to any exclusion rules ("[^").

These changes for handling EOF should probably be ported to the INI
scanner as well for the above reason and to keep them similar.
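
To make the \x00 trade-off concrete, here is a minimal C sketch of why a NUL sentinel at the end of the buffer forces \x00 into every negated class. It is not taken from the patch: `scan_excluding_nul` and `stop_reason` are hypothetical names, and the buffer layout (len bytes followed by one '\0') is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not PHP's scanner. */
typedef enum { STOP_NEWLINE, STOP_EOF, STOP_EMBEDDED_NUL } stop_reason;

/* Matches the equivalent of [^\n\x00]* from the start of buf and reports
 * why the match stopped. Assumes buf holds len bytes plus a '\0' sentinel. */
size_t scan_excluding_nul(const char *buf, size_t len, stop_reason *why)
{
    size_t i = 0;

    assert(buf != NULL && why != NULL);
    assert(buf[len] == '\0');            /* the sentinel must be present */

    /* The sentinel guarantees the loop stops at or before index len. */
    while (buf[i] != '\n' && buf[i] != '\0') {
        i++;
    }

    if (buf[i] == '\n') {
        *why = STOP_NEWLINE;
    } else if (i == len) {
        *why = STOP_EOF;                 /* NUL is the sentinel: real end of input */
    } else {
        *why = STOP_EMBEDDED_NUL;        /* NUL inside the input: behavior differs */
    }
    return i;
}
```

The STOP_EMBEDDED_NUL case is the one that changes behavior: input containing a literal NUL used to be matched straight through by ANY_CHAR-style rules, and with the exclusion it is cut short unless the scanner handles that case separately.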

I don't have much time right now, but looked at it quick, and see that you're actually trying to work around the re2c issues in general. :-) I was only thinking of putting a "band-aid" on the comment symptom(s), since those are about the only ones that occur with valid code (is the tokenizer ext. *supposed to* handle all tokens in code that wouldn't really compile?). And yeah, about excluding \x00 from ANY_CHAR, it could change things, since it's always been allowed, although it seems strange that code would have literal NULLs in it (generated eval()'d code?). That was part of the reason I couldn't come up with a generic fix while keeping all behavior. If re2c would just remember the last matching state it was in at EOF like Flex!

Otherwise, I don't know what to do. :-/ I'm going to do something else before trying to implement what I was going to do, so there's no patch yet...

As far as I know there's still the other comment-related issue where no
Warning is given about "Unterminated comment ..." for unclosed /* ...
It's all of course related to the fundamental re2c issue, for now, where
when the scanned input ends while a variable length part of a rule is
being matched, it just aborts ("return 0;") in YYFILL().
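
The failure mode described above can be sketched in plain C, outside of re2c. This is a hand-written stand-in, not the generated scanner: both function names are hypothetical, and the "return 0" in the first function only imitates what the text says YYFILL() does when input runs out during a variable-length match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini scanner for a "//" comment, i.e. "//" [^\n]* "\n"?.
 * Returns the token length, or 0 if no token is produced. */

/* Imitates the reported behavior: running out of input while the
 * variable-length part is still matching aborts the whole token. */
size_t scan_comment_abort_on_eof(const char *s, size_t len)
{
    size_t i = 2;

    assert(s != NULL);
    if (len < 2 || s[0] != '/' || s[1] != '/') {
        return 0;                        /* not a single-line comment */
    }
    for (;;) {
        if (i == len) {
            return 0;                    /* fill failed at EOF: token is lost */
        }
        if (s[i] == '\n') {
            return i + 1;                /* newline ends the comment */
        }
        i++;
    }
}

/* Same rule, but end of input is accepted as ending the comment. */
size_t scan_comment_accept_eof(const char *s, size_t len)
{
    size_t i = 2;

    assert(s != NULL);
    if (len < 2 || s[0] != '/' || s[1] != '/') {
        return 0;
    }
    while (i < len && s[i] != '\n') {
        i++;
    }
    if (i < len) {
        i++;                             /* consume the newline if there is one */
    }
    return i;
}
```

With the input "// a" and no trailing newline, the first function yields no token at all, which matches the symptom in the bug title; the second returns the full four bytes.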

I don't seem to see this problem; perhaps I'm not reproducing it correctly?

As far as the Warning, with "<?php /* blah " do you get "Unterminated comment ..." ? Of course your patch would restore it, because it's missing last I checked (not able to right now).
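
For reference, the check behind that Warning can be sketched as follows. This is only an illustration under assumptions: `scan_block_comment` is a hypothetical helper, not the scanner's actual rule, and the caller is assumed to be the one that emits "Unterminated comment ..." when the function reports an unclosed comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scans a block comment at the start of s.
 * Returns 1 if it is closed, 0 if input ends first (the caller would warn),
 * and -1 if s does not start with the comment opener.
 * On 0 or 1, *consumed is the number of bytes belonging to the comment. */
int scan_block_comment(const char *s, size_t len, size_t *consumed)
{
    size_t i;

    assert(s != NULL && consumed != NULL);
    if (len < 2 || s[0] != '/' || s[1] != '*') {
        return -1;
    }
    /* Start after the opener so that "/" "*" "/" is not treated as closed. */
    for (i = 2; i + 1 < len; i++) {
        if (s[i] == '*' && s[i + 1] == '/') {
            *consumed = i + 2;
            return 1;
        }
    }
    *consumed = len;                     /* everything up to EOF is comment text */
    return 0;
}
```

Run on the text after the open tag in the example above ("/* blah "), this returns 0 with all 8 bytes consumed, which is the case where the Warning is expected.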

And that applies to the case Lukas gave in the bug report: WHITESPACE
pattern is variable length.

Didn't see/find this; is there a bug # or link?

I meant the "could be related if not the same problem" comment added the other day in Bug #46817.


-shire

- Matt

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php