-- Guillaume Oriol <gor...@technema.fr> wrote
(on Wednesday, 22 April 2009, 06:28 AM -0700):
> I missed something when tracking this bug:
> if you look closely at the regexp, you'll see a question mark following ".*"
> in parentheses.
> I guess this is an error as I don't understand its meaning.

In PCRE, *? is a non-greedy quantifier; what that means is "match 0 or
more of the preceding item, but as few as possible, stopping as soon as
the rest of the pattern matches".

As an example, consider the following string:

    abcdeabcde

The pattern /.*d/ would match 'abcdeabcd'; however, making it
non-greedy, /.*?d/, would match 'abcd'.
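
The difference is easy to check with preg_match(). This is only a small
sketch; the variable names ($subject, $greedy, $lazy) are made up for the
illustration and do not come from the framework code.

```php
<?php
// Illustrative only: variable names are not from the framework code.
$subject = 'abcdeabcde';

// Greedy: .* grabs as much as it can, then backs off to the last 'd'.
preg_match('/.*d/', $subject, $greedy);

// Non-greedy: .*? grows one character at a time and stops at the first 'd'.
preg_match('/.*?d/', $subject, $lazy);

echo $greedy[0], "\n"; // abcdeabcd
echo $lazy[0], "\n";   // abcd
```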

/.*?/ basically says "match 0 or more of any character, but as few as
possible." In the regexp below, we're looking for any string potentially
bounded by whitespace at the beginning or end. The pattern is basically
meaningless, as it's looking for *any* character, only *optionally*
bounded by whitespace. Clearly, non-greedy quantifiers must have some
limits in PCRE, and you're running into them.
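
To see which limit is involved, you can ask PCRE directly when
preg_replace() returns NULL. The sketch below uses preg_last_error() and
the pcre.backtrack_limit / pcre.recursion_limit ini settings; that one of
these is what your server is hitting is an assumption on my part, not
something I have confirmed, and the result will vary with PHP version and
configuration.

```php
<?php
// Sketch: $a stands in for the large captured string from the report.
$a = str_repeat('a', 49998);

$result = preg_replace('/^\s*(.*?)\s*$/s', '$1', $a);

if ($result === null) {
    // preg_replace() returns NULL on error; preg_last_error() says which one.
    $error = preg_last_error();
    if ($error === PREG_BACKTRACK_LIMIT_ERROR) {
        echo "backtrack limit hit, pcre.backtrack_limit = ",
            ini_get('pcre.backtrack_limit'), "\n";
    } elseif ($error === PREG_RECURSION_LIMIT_ERROR) {
        echo "recursion limit hit, pcre.recursion_limit = ",
            ini_get('pcre.recursion_limit'), "\n";
    } else {
        echo "PCRE error code ", $error, "\n";
    }
} else {
    echo "ok, ", strlen($result), " bytes returned\n";
}
```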

Probably a better solution is to do this:

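    // Strip leading whitespace, if any.
    if (preg_match('/^(\s+)/', $a, $matches)) {
        // Cast guards against substr() returning false on an all-whitespace string.
        $a = (string) substr($a, strlen($matches[1]));
    }
    // Strip trailing whitespace, if any.
    if (preg_match('/(\s+)$/', $a, $matches)) {
        $a = substr($a, 0, strlen($a) - strlen($matches[1]));
    }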
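
Another way to write the same two-step trim is with two anchored
preg_replace() calls, which avoids the lazy (.*?) group entirely. This is
a sketch only: the function name trimWhitespace() is hypothetical and is
not part of Zend Framework.

```php
<?php
// Hypothetical helper, not part of Zend Framework.
function trimWhitespace($text)
{
    // Remove leading whitespace; the pattern is anchored at the start.
    $text = preg_replace('/^\s+/', '', $text);
    // Remove trailing whitespace; the pattern is anchored at the end.
    $text = preg_replace('/\s+$/', '', $text);
    return $text;
}

echo trimWhitespace("  \n\tfoo bar \r\n"), "\n"; // foo bar
```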

Could you open an issue in the tracker and note your issue plus the
solution, please?


> On my server:
> <pre>
> $a = str_repeat('a', 49997);
> $a = preg_replace('/^\s*(.*?)\s*$/s', '$1', $a);
> </pre>
> would return the string but:
> <pre>
> $a = str_repeat('a', 49998);
> $a = preg_replace('/^\s*(.*?)\s*$/s', '$1', $a);
> </pre>
> would return NULL.
> If I remove the question mark, preg_replace operates properly, whatever size
> the string is.
> 
> 
> Guillaume Oriol wrote:
> > 
> > Thank you Matthew for your answer but, according to PHP manual, the trim()
> > function removes ALL whitespace characters from beginning/end of the
> > string (and not only the first one). Furthermore, the trim() function
> > removes not only space but also:
> >     * "\t" (ASCII 9 (0x09))
> >     * "\n" (ASCII 10 (0x0A))
> >     * "\r" (ASCII 13 (0x0D))
> >     * "\0" (ASCII 0 (0x00))
> >     * "\x0B" (ASCII 11 (0x0B))
> > 
> > I will post a message to php-internals regarding the issue on
> > preg_replace.
> > 
> > 
> > Matthew Weier O'Phinney-3 wrote:
> >> 
> >> -- Guillaume Oriol <gor...@technema.fr> wrote
> >> (on Monday, 20 April 2009, 09:13 AM -0700):
> >>> Hi, I discovered an issue with the
> >>> javascriptCaptureStart/javascriptCaptureEnd function pair. When the
> >>> captured text exceeds a certain limit (about 50kB in my case), the
> >>> function returns only a semi-colon. I have the following code in a
> >>> view script:
> >>> 
> >>> <?php $this->dojo()->javascriptCaptureStart(); ?>
> >>> var data = <?php echo $this->data; ?>;
> >>> ...
> >>> <?php $this->dojo()->javascriptCaptureEnd(); ?>
> >>> 
> >>> And, as the number of rows in my database table is growing,
> >>> $this->data is getting bigger and bigger. Finally, over ~50KB, the
> >>> PHP tag returns a semi-colon and nothing else (not even the
> >>> "var data =" preceding that tag).
> >>> I was able to trace this issue back to the function
> >>> addJavascript($js) in Zend_Dojo_View_Helper_Dojo_Container and more
> >>> precisely to the preg_replace function:
> >>> 
> >>>         $js = preg_replace('/^\s*(.*?)\s*$/s', '$1', $js);
> >>> 
> >>> I replaced it by:
> >>> 
> >>>         $js = trim($js);
> >>> 
> >>> and everything was fine. Therefore, I have two questions:
> >>> - is there a known limitation on preg_replace()?
> >>> - why did you use a preg_replace function to trim the string?
> >> 
> >> I'm not aware of any limitations on preg_replace(), but you might want
> >> to either file a bug with php.net or ask on the php-internals mailing
> >> list about it -- that seems like odd behavior.
> >> 
> >> We chose to use preg_replace over trim() as it allows removing more than
> >> one whitespace character from front and back, and will include newlines
> >> when doing so.
> >> 
> >> -- 
> >> Matthew Weier O'Phinney
> >> Project Lead            | matt...@zend.com
> >> Zend Framework          | http://framework.zend.com/
> >> 
> >> 
> > 
> > 
> 
> 
> -----
> Guillaume ORIOL
> Software architect
> Technema
> -- 
> View this message in context: 
> http://www.nabble.com/size-limit-raised-on-javascriptCaptureStart%28%29-tp23139812p23175439.html
> Sent from the Zend Framework mailing list archive at Nabble.com.
> 

-- 
Matthew Weier O'Phinney
Project Lead            | matt...@zend.com
Zend Framework          | http://framework.zend.com/
