Robin Vickery said:
>
> The S modifier that you're using means that it's storing the studied
> expression. If the regexp changes each time around the loop then over
> 30000 iterations, that'll add up. See if removing that modifier helps
> at all.
>
The S modifier wasn't needed. I added it because I thought it would speed things up,
but it didn't. Removing it didn't reduce the memory usage, but the script performs a
little better without it.
> If that's not it, then these *might* save you some memory, although
> I've not tested them:
>
> I'm not entirely sure why you're matching (.*) at the end then putting
> it back in with your replacement text. Without running it, I'd have
> thought that you could leave out the (.*) from your pattern and the $4
> from your replacement and get exactly the same effect.
>
I tried removing $4 and (.*), but the result isn't the same. Actually, my first
regexp didn't have $4, but I had to add it: without it, 51 of the 1246 texts aren't
processed correctly. Also, there isn't really any difference in performance with or
without it.
>
> You could use a non-capturing subpattern for $2 which you're not using
> in your replacement.
>
> $replace = "/^((?:[a-z]+?[^a-z]+?){".($count)."})(".$typedmask.")/i";
I didn't know you could do that... cool :). This made the script run a little faster,
but it still uses the same amount of memory.
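
For reference, a minimal sketch of what the non-capturing subpattern changes. The
values of $count, $typedmask and $text are made up for illustration and are not from
the real data. With (?:...) the inner group no longer takes a number, so the typed
word moves from $3 to $2:

```php
<?php
// Sample values are assumptions for illustration only.
$count = 2;
$typedmask = 'go';
$text = 'we will go home';

// Capturing inner group: the repeated group is $2, the typed word is $3.
$capturing = '/^(([a-z]+?[^a-z]+?){' . $count . '})(' . $typedmask . ')/i';
// Non-capturing inner group: the typed word moves up to $2.
$nonCapturing = '/^((?:[a-z]+?[^a-z]+?){' . $count . '})(' . $typedmask . ')/i';

preg_match($capturing, $text, $a);
preg_match($nonCapturing, $text, $b);

// Number of entries in each match array (full match plus groups).
echo count($a), ' ', count($b), "\n";
```

This is why the replacement string has to be renumbered when the group is made
non-capturing.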
>
> And maybe a look-behind assertion for the first subpattern rather than
> a capturing pattern then re-inserting $1.
>
> $replace = "/^(?<=(?:[a-z]+?[^a-z]+?){".($count)."})(".$typedmask.")/i";
> $with = "<error-start sourcetext=".$corr['sourcetext']." id=".$corr['id']."
> ...
>
With ?<= I get a lot of warnings.
Here is an example:
$replace is '/^(?<=(?:[a-z]+?[^a-z]+?){50})(go)(.*)/i'
$with is '<error-start sourcetext=3 id=49 group="-" class="-" corrected-from="go"
corrected-to="god">$2<error-end sourcetext=3 id=49>$3'
Warning: Compilation failed: lookbehind assertion is not fixed length at
offset 34
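
The warning appears because PCRE only accepts a lookbehind whose length is fixed,
and the repeated [a-z]+?[^a-z]+? group can match any number of characters. One
possible alternative, assuming the installed PCRE is recent enough to support \K
(this is an assumption and was not part of Robin's suggestion), is to match the
prefix normally and then reset the start of the match with \K, so the prefix is
neither captured nor replaced. The sample values and the shortened replacement
string below are made up for illustration:

```php
<?php
// Sample values are assumptions for illustration only.
$count = 2;
$typedmask = 'go';
$text = 'we will go home';

// \K drops everything matched so far from the reported match, so only the
// typed word is replaced and the prefix does not need to be re-inserted.
$replace = '/^(?:[a-z]+?[^a-z]+?){' . $count . '}\K(' . $typedmask . ')/i';
$with = '<error-start>$1<error-end>';

$result = preg_replace($replace, $with, $text, 1);
echo $result, "\n";
```

Whether this changes the memory usage is untested.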
With the corrections added, the regexp looks like this:
$typedmask = preg_replace("/\s+/",".*?",$corr['typed']);
$replace = '/^((?:[a-z]+?[^a-z]+?){'.($count).'})('.$typedmask.')(.*)/i';
$with = '$1<error-start sourcetext='.$corr['sourcetext'].' id='.$corr['id'].'
group="'.$corr['grupper'].'" class="'.$corr['ordklasse'].'"
corrected-from="'.$corr['typed'].'"
corrected-to="'.$corr['corrected'].'">$2<error-end
sourcetext='.$corr['sourcetext'].' id='.$corr['id'].'>$3';
$text = $skipText[0] . preg_replace ($replace,$with,$text,1);
It completes a little faster and the output is exactly the same as before,
but it still uses way too much memory.
[EMAIL PROTECTED] testextract]# time php ../export.php > export6.txt
real 1m15.851s
user 0m18.720s
sys 0m1.750s
From "top" just before the script completed:
PID  USER PRI NI SIZE RSS  SHARE STAT %CPU %MEM TIME COMMAND
7843 root 17  0  269M 269M 3328  R    41.7 53.6 0:19 php
This isn't a huge problem anymore, as we have been allowed to move the project to a
server that is 3 times faster and has less activity (because of this).
But I would still like to know if there is a solution, because it seems quite
insane that it allocates more than 250MB of memory to generate 4MB of output.
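
One direction that might be worth trying is to avoid the {count} repetition
altogether: find the byte offset where the first $count words end, then match the
typed word at that offset with \G. The sketch below is untested on the real data,
wrapAfterWords() is a hypothetical helper name, and it is not guaranteed to behave
exactly like the original pattern (for example, it does not require the text to
start with a letter). Whether it actually lowers the memory usage is an assumption.

```php
<?php
// Hypothetical helper: wraps the match of $typedmask that starts right after
// the first $count "letters then non-letters" units of $text.
function wrapAfterWords($text, $count, $typedmask, $open, $close)
{
    $offset = 0;
    if ($count > 0) {
        // Collect each unit together with its byte offset in $text.
        preg_match_all('/[a-z]+[^a-z]+/i', $text, $units, PREG_OFFSET_CAPTURE);
        if (count($units[0]) < $count) {
            return $text; // fewer words than expected: leave the text alone
        }
        $last = $units[0][$count - 1];
        $offset = $last[1] + strlen($last[0]);
    }
    // \G anchors the match at the starting offset passed to preg_match().
    if (!preg_match('/\G(?:' . $typedmask . ')/i', $text, $m, 0, $offset)) {
        return $text; // typed word not found at that position
    }
    return substr($text, 0, $offset) . $open . $m[0] . $close
        . substr($text, $offset + strlen($m[0]));
}

echo wrapAfterWords('we will go home', 2, 'go', '<e>', '</e>'), "\n";
```

The idea is that each call only runs two small patterns instead of one pattern
whose repeat count grows with the position in the text.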
Thanks Robin! I really appreciate your answer.
Brgds Ulrik
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php