Hey Matt,

Thanks for posting, sorry for not having a chance to reply to this sooner.  
Maybe couple things from the patch,

+/* To save initial string length after scanning to first variable, 
CG(doc_comment_len) can be reused */
+#define double_quotes_scanned_len CG(doc_comment_len)
+

(minor) Maybe we should rename this var if we're going to use it for other 
purposes, this doesn't really save any typing.  Also if we do want the define 
maybe we should upper case it so it's more obvious?

+       while (YYCURSOR < YYLIMIT) {
+               switch (*YYCURSOR++) {

In the example above, which we have a couple examples of here, we don't obey 
the YYFILL macro to detect if we have exceeded our EOF.  This *might* be a 
problem, but only really depends on if we intend to use the YYFILL as a 
solution for exceeding our mmap bounds.

Regarding the ZEND_MMAP_AHEAD issue and the temp. fix that Dmitry put in we 
need to find a solution to that, perhaps I can play with that this week too as 
I think I'm seeing some related issues in my testing of 5.3.  Essentially we 
abuse ZEND_MMAP_AHEAD by adding it to the file size and passing it to the mmap 
call which isn't at all valid and only really works up to PAGESIZE.  We could 
possibly use YYFILL to re-allocate more space as necessary past the end of file 
to fix this.

I don't see anything glaring in the patch that's a major issue, I can probably 
test more on a larger code base in the next 2-3 days.  As I've said before this 
seems to be crossing the line of us writing a scanner by hand rather than 
letting re2c do the heavy lifting, but without a modification to re2c to handle 
EOF I don't have an alternative solution currently.  (If we had some way to 
detect which regex we where matching against in the YYFILL that would likely be 
able to handle these bugs, but I didn't see a way to do that easily).

-shire

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to