Re: mod_perl for multi-process file processing?

Alexandr Evstigneev Mon, 02 Feb 2015 21:06:07 -0800

Pre-loading is good, but what you need, I believe, is the Storable module. If
your files contain parsed data (hashes), just store them serialized. If
they contain raw data that needs to be parsed, you can pre-parse it, serialize
it, and store it as binary files.
Storable is written in C and is very fast.
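
A minimal sketch of that pre-parse-and-serialize idea, assuming one hash
per file. The sample data and the file name big_hash.sto are made up for
illustration; nstore() and retrieve() are the standard Storable calls.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);

# Hypothetical sample data standing in for one already-parsed hash.
my %big_hash = (
    case_1 => { score => 10, tags => [ 'a', 'b' ] },
    case_2 => { score => 20, tags => [ 'c' ] },
);

# Temporary directory so the sketch cleans up after itself.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/big_hash.sto";    # hypothetical file name

# One-time step: serialize the parsed structure to a binary file.
# nstore() writes in network byte order, so the file is portable.
nstore( \%big_hash, $file );

# At startup: read the structure back without re-parsing raw text.
my $loaded = retrieve($file);

print "case_2 score: $loaded->{case_2}{score}\n";
```

Whether this cuts the 5 minute load time depends on how much of it is
parsing rather than disk I/O, so it is worth timing on one real file first.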



2015-02-03 7:11 GMT+03:00 Alan Raetz <alanra...@gmail.com>:

> So I have a perl application that upon startup loads about ten perl hashes
> (some of them complex) from files. This takes up a few GB of memory and
> about 5 minutes. It then iterates through some cases and reads from (never
> writes) these perl hashes. To process all our cases, it takes about 3 hours
> (millions of cases). We would like to speed up this process. I am thinking
> this is an ideal application of mod_perl because it would allow multiple
> processes but share memory.
>
> The scheme would be to load the hashes on apache startup and have a master
> program send requests with each case and apache children will use the
> shared hashes.
>
> I just want to verify some of the details about variable sharing.  Would
> the following setup work (oversimplified, but you get the idea…):
>
> In a file Data.pm, which I would use() in my Apache startup.pl, I would
> load the perl hashes and have hash references that would be retrieved with
> class methods:
>
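> package Data;
>
> my %big_hash;
>
> # three-argument open with a lexical filehandle, and fail loudly
> open(my $fh, '<', 'file.txt') or die "Cannot open file.txt: $!";
>
> while ( <$fh> ) {
>
>       … code ….
>
>       $big_hash{ $key } = $value;
> }
>
> close($fh);
>
> sub get_big_hashref {   return \%big_hash; }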
>
> <snip>
>
> And so in the apache request handler, the code would be something like:
>
> use Data;
>
> my $hashref = Data::get_big_hashref();
>
> …. code to access $hashref data with request parameters…..
>
> <snip>
>
> The idea is the HTTP request/response will contain the relevant
> input/output for each case… and the master client program will collect
> these and concatenate the final output from all the requests.
>
> So any issues/suggestions with this approach? I am facing a non-trivial
> task of refactoring the existing code to work in this framework, so just
> wanted to get some feedback before I invest more time into this...
>
> I am planning on using mod_perl 2.07 on a linux machine.
>
> Thanks in advance, Alan
>

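The sharing idea in the quoted message (load once in the parent, read from
many children) can be tried outside Apache with plain fork(). This is only
a sketch under assumptions: the hash contents, the case list, and the worker
count are invented, and it stands in for a prefork-style server rather than
showing mod_perl itself. Note that Perl can still write to shared pages when
it reads data (reference counts, cached string/number conversions), so the
amount of memory that stays shared should be measured, not assumed.

```perl
use strict;
use warnings;

# Parent: load the data once, before any child exists.
# Hypothetical contents; the real hashes come from the files.
my %big_hash = map { ( "key$_" => $_ * 2 ) } 1 .. 1000;

# Hypothetical work list: each case is just a key to look up.
my @cases   = map { "key$_" } 1 .. 8;
my $workers = 2;

my @pids;
for my $w ( 0 .. $workers - 1 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {
        # Child: read-only access to the hash inherited from the parent.
        # Cases are split round-robin by index.
        for my $i ( grep { $_ % $workers == $w } 0 .. $#cases ) {
            my $case = $cases[$i];
            print "worker $w: $case => $big_hash{$case}\n";
        }
        exit 0;
    }
    push @pids, $pid;
}

# Parent: wait for every child and count the ones that failed.
my $failed = 0;
for my $pid (@pids) {
    waitpid( $pid, 0 );
    $failed++ if $? != 0;
}
print "failed workers: $failed\n";
```

Here each child prints its own results; in the scheme described above the
master program would collect them instead.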