hi,

sorry for the OT post, but i have tried elsewhere and this really is a
general perl question ...

i have a spidering script that pulls links/urls from some of our sites
using LWP. [this script is in fact a modification of the spider
written by sean burke in 'perl & lwp']

while doing this i store the urls in hashes like this:

$ok_in_site{$url}      = $response;
$not_ok_in_site{$url}  = $response;
$ok_out_site{$url}     = $response;
$not_ok_out_site{$url} = $response;

$url = up to 255 characters
$response = about 15 characters
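
to give a bit more context, the classification happens roughly like
this. [heavily simplified sketch -- $base_host, the HEAD request and
the sub name are just for illustration, the real script follows
burke's spider much more closely]

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use URI;

# hypothetical host name -- the real script knows which sites are "ours"
my $base_host = 'www.example.com';
my $ua        = LWP::UserAgent->new;

my ( %ok_in_site, %not_ok_in_site, %ok_out_site, %not_ok_out_site );

# for every url pulled out of a page, fetch its status and file it
# into one of the four hashes
sub check_url {
    my ($url) = @_;
    my $res      = $ua->head($url);      # HEAD is enough to get the status
    my $response = $res->status_line;    # e.g. "404 Not Found" -- a short string
    my $in_site  = URI->new($url)->host eq $base_host;

    if    ( $res->is_success and $in_site )     { $ok_in_site{$url}      = $response }
    elsif ( $res->is_success and not $in_site ) { $ok_out_site{$url}     = $response }
    elsif ($in_site)                            { $not_ok_in_site{$url}  = $response }
    else                                        { $not_ok_out_site{$url} = $response }
}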

i really do need to store these values somewhere, because eventually
they need to be sorted for the final html output.
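[the sorting i have in mind is nothing fancier than walking each hash
in key order and printing table rows, roughly like this -- same
%not_ok_in_site as above, output trimmed down]

# final report: one html table row per url, sorted alphabetically
for my $url ( sort keys %not_ok_in_site ) {
    print "<tr><td>$url</td><td>$not_ok_in_site{$url}</td></tr>\n";
}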

all this behaves very well and fast on sites with few links (fewer
than 2000), but when i run the script on a large site (more than 8000
unique links) the memory usage increases so dramatically that i can't
even stop the program normally. [it takes a couple of control-c's to
kill it]

i am not sure whether the problem is in the hash storage itself or in
the post-sorting. [back-of-the-envelope: 8000 urls at roughly 270
characters each is only around 2 MB of raw string data, so i wouldn't
expect the hashes alone to eat that much.] but if i monitor the memory
consumption while pulling the links/urls i can clearly see it
increasing steadily.

so what i'm looking for is some advice. i am thinking of these
alternatives for the above hashes:

a) store the urls / responses in a temp txtfile (rough sketch below)
b) store the urls / responses in a temp db-table
c) someone with a better idea

a) sounds even more consuming to read back
b) sounds less flexible across systems
c) ?
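
to make a) a bit more concrete, this is roughly what i am picturing.
[untested sketch -- the filename, the tab-separated format and the
sub names are just placeholders]

#!/usr/bin/perl
use strict;
use warnings;

my $tmpfile = 'ok_in_site.tmp';    # one temp file per hash

# while spidering: append each result instead of keeping it in a hash
sub remember_url {
    my ( $url, $response ) = @_;
    open my $tmp, '>>', $tmpfile or die "can't open $tmpfile: $!";
    print {$tmp} "$url\t$response\n";
    close $tmp;
}

# at report time: read everything back, sort, and print the html rows
sub report {
    open my $in, '<', $tmpfile or die "can't open $tmpfile: $!";
    my @lines = sort <$in>;
    close $in;
    for my $line (@lines) {
        chomp $line;
        my ( $url, $response ) = split /\t/, $line, 2;
        print "<tr><td>$url</td><td>$response</td></tr>\n";
    }
}

that would keep the crawl itself light on memory, but as i said under
a) the read-back and sort at report time still worry me.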


really hope someone can help, as it's pretty hard to test.
../allan
