Without the quotes the shell splits the text into words and echo joins
them with single spaces, so sed gets it all on a single line. A 45 kB
line can be tough on a regex.
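
A small sketch of the difference, assuming a POSIX-style shell. The
three-line value of webpage below is a made-up stand-in for the real
page, and the two *_lines variables are just names picked for the
illustration:

```shell
# Hypothetical stand-in for the downloaded page: three short lines.
webpage='line one
{"token":"abc123"}
line three'

# Quoted: the newlines survive, so sed would see three short lines.
quoted_lines=$(echo "$webpage" | wc -l)

# Unquoted: the shell splits the value on whitespace (newlines included)
# and echo prints the resulting words separated by single spaces,
# so sed would see one long line.
unquoted_lines=$(echo $webpage | wc -l)

echo "quoted: $quoted_lines line(s), unquoted: $unquoted_lines line(s)"
```

With the real 45 kB page the unquoted form hands sed that entire
document as one line, which is the case that took over two minutes.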

/Alexander

j...@wxcvbn.org wrote:
>Denis Fondras <open...@ledeuns.net> writes:
>
>> Hello all,
>
>Hi,
>
>> This afternoon I stumbled upon a weirdness I can't explain. I hope some
>> misc-guru can give a clue.
>>
>> I was parsing a 45kB html document on my OpenBSD 5.3 with the help of
>> sed to extract a value and it was awfully slow. Quoting the input string
>> gave it a real boost:
>>
>> $ time echo "$webpage" | sed -n -r 's/(.*)\"token\":\"([a-zA-Z0-9]+)\"(.*)/\2/p'
>>     0m0.19s real     0m0.00s user     0m0.00s system
>>
>> $ time echo $webpage | sed -n -r 's/(.*)\"token\":\"([a-zA-Z0-9]+)\"(.*)/\2/p'
>>     2m14.39s real     2m12.95s user     0m0.00s system
>>
>>
>> What could be the explanation?
>
>Without the quotes the shell performs word splitting; maybe ksh(1) is a
>bit slow at this...  I'd rather download the page to a temp file than
>put that stuff into memory.
>
>> Doing the same with GNU sed is instantaneous in both cases
>> (quoted/unquoted).
>
>Just by replacing sed by gsed, on the same system?
>
>> Thank you in advance,
>> Denis
