Robin Vickery said:
> The S modifier that you're using means that it's storing the studied
> expression. If the regexp changes each time around the loop then over
> 30000 iterations, that'll add up. See if removing that modifier helps
> at all.

The S modifier wasn't needed. I added it because I thought it would speed things up, but it didn't. Removing it didn't reduce the memory usage, but the script performs a little better without it.

> If that's not it, then these *might* save you some memory, although
> I've not tested them:
>
> I'm not entirely sure why you're matching (.*) at the end then putting
> it back in with your replacement text. Without running it, I'd have
> thought that you could leave out the (.*) from your pattern and the $4
> from your replacement and get exactly the same effect.

I tried removing $4 and (.*), but the result isn't the same. My first regular expression actually didn't have $4, but I had to add it: without it, 51 of the 1246 texts aren't processed correctly. There also isn't really any difference in performance with or without it.

> You could use a non-capturing subpattern for $2 which you're not using
> in your replacement.
>
> $replace = "/^((?:[a-z]+?[^a-z]+?){".($count)."})(".$typedmask.")/i";

I didn't know you could do that, cool :). This made the script run a little faster, but it still uses the same amount of memory.

> And maybe a look-behind assertion for the first subpattern rather than
> a capturing pattern then re-inserting $1.
>
> $replace = "/^(?<=(?:[a-z]+?[^a-z]+?){".($count)."})(".$typedmask.")/i";
> $with = "<error-start sourcetext=".$corr['sourcetext']." id=".$corr['id']."
> ...

With ?<= I get a lot of warnings. Here is an example:

$replace is '/^(?<=(?:[a-z]+?[^a-z]+?){50})(go)(.*)/i'
$with is '<error-start sourcetext=3 id=49 group="-" class="-" corrected-from="go" corrected-to="god">$2<error-end sourcetext=3 id=49>$3'

<br />
<b>Warning</b>: Compilation failed: lookbehind assertion is not fixed length at offset 34

With the corrections added, the regular expression looks like this:

$typedmask = preg_replace("/\s+/",".*?",$corr['typed']);
$replace = '/^((?:[a-z]+?[^a-z]+?){'.($count).'})('.$typedmask.')(.*)/i';
$with = '$1<error-start sourcetext='.$corr['sourcetext'].' id='.$corr['id']
      .' group="'.$corr['grupper'].'" class="'.$corr['ordklasse']
      .'" corrected-from="'.$corr['typed'].'" corrected-to="'.$corr['corrected']
      .'">$2<error-end sourcetext='.$corr['sourcetext'].' id='.$corr['id'].'>$3';
$text = $skipText[0] . preg_replace ($replace,$with,$text,1);

It completes a little faster and the output is exactly the same as before, but it still uses way too much memory.

[EMAIL PROTECTED] testextract]# time php ../export.php > export6.txt

real    1m15.851s
user    0m18.720s
sys     0m1.750s

From "top" just before the script completed:

PID  USER PRI NI SIZE RSS  SHARE STAT %CPU %MEM TIME COMMAND
7843 root 17  0  269M 269M 3328  R    41.7 53.6 0:19 php

This isn't a huge problem anymore, as we have been allowed to move the project to a server that is 3 times faster and has less activity (because of this). But I would still like to know if there is a solution, because it seems quite insane that the script allocates more than 250MB of memory to generate 4MB of output.

Thanks Robin! I really appreciate your answer.

Brgds Ulrik
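
For readers following the thread, here is a minimal sketch of the two ways of keeping the skipped prefix that were discussed: capturing it and re-inserting $1, and an alternative to the failed look-behind. The sample $text, $corr and $count values are hypothetical, the replacement text is shortened, and the \K variant is an assumption that only applies if the installed PCRE is 7.2 or later. It is not a tested fix for the memory usage.

```php
<?php
// Hypothetical sample data; the real $corr rows and $text come from the poster's own data.
$text  = "We had a go time at the fair";
$corr  = array('id' => 49, 'typed' => 'go');
$count = 3; // number of words to skip before the typed word

// Same mask construction as in the thread: whitespace in the typed text becomes .*?
$typedmask = preg_replace('/\s+/', '.*?', $corr['typed']);

// Variant from the thread: non-capturing inner group, prefix captured and re-inserted as $1.
$replace    = '/^((?:[a-z]+?[^a-z]+?){' . $count . '})(' . $typedmask . ')(.*)/i';
$with       = '$1<error-start id=' . $corr['id'] . '>$2<error-end id=' . $corr['id'] . '>$3';
$viaCapture = preg_replace($replace, $with, $text, 1);

// Alternative to the look-behind (assumes PCRE 7.2+): \K discards the prefix
// from the reported match, so it does not have to be fixed length or re-inserted.
$replaceK = '/^(?:[a-z]+?[^a-z]+?){' . $count . '}\K(' . $typedmask . ')(.*)/i';
$withK    = '<error-start id=' . $corr['id'] . '>$1<error-end id=' . $corr['id'] . '>$2';
$viaK     = preg_replace($replaceK, $withK, $text, 1);

echo $viaCapture, "\n";
echo $viaK, "\n";
```

Both calls should print the same line, with only the fourth word wrapped in the error tags. Whether the \K form changes memory use in the 30000-iteration loop is untested.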