Jeremy,

>Greetings...
>
>I do something similar, pulling across 400+ log files, 3GB+ total size
>every night.  However, I am using considerably fewer drones than you.
>We found that we reached a saturation point long before that.  Disk I/O
>and network I/O both became bottlenecks before we reached 20 processes.

We use a satellite connection, so the upstream bandwidth is a crawl,
somewhere between 9600 baud and 14.4k.  The server is on 100Mb full
duplex.  Disk I/O would be the first to go, but so far we're looking
good (a bunch of striped Ultra3 Cheetahs).


>Server has 8 processors and 16GB real, 16GB swap.  I also chose to use
>the Net::FTP module rather than continually forking off more
>sub-processes using the UNIX ftp client.  Plus, I found that I have
>better control and real error-checking with the module.

Yep, same reasons we want to move that way - it sucks to lose all files
from a site if just ONE file fails...
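
The kind of per-file handling I'm after looks roughly like this (just a
sketch -- the host, login, and file list are made up):

    #!/usr/bin/perl -w
    use strict;
    use Net::FTP;

    my $host  = 'logs.example.com';           # placeholder site
    my @files = qw(access.log error.log);     # placeholder file list

    my $ftp = Net::FTP->new($host, Timeout => 120)
        or die "can't connect to $host: $@";
    $ftp->login('user', 'pass') or die "login failed: " . $ftp->message;
    $ftp->binary;

    # One failed get() no longer costs us the whole site -- log it
    # and move on to the next file.
    for my $file (@files) {
        unless ($ftp->get($file)) {
            warn "$host: failed to get $file: " . $ftp->message;
            next;
        }
    }
    $ftp->quit;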

>I have not seen any real memory problem inherently due to forking.
>However, the drone process will pre-allocate the same amount of memory
>as the queen.  So, if the queen is 10MB and you create 100
>drones...10x100...  The only way to reduce this is to spawn the drones
>before pulling anything 'extra' into memory.

heheh  with 32GB total to work with I should hope you don't see any
problems!  :)

You have hit upon exactly my fear.  The current design uses a decent
size data structure (many nested hashes/arrays) to hold sites in the
proper "groups" and all "transactions" for each site.  My fear is that
it needs to be built BEFORE forking off children, so every child gets a
copy of it.  My question is: can I immediately undef the structure in
the child to free up the memory?  I dunno.
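
Roughly what I mean, as a sketch (build_site_structure() and
process_site() are stand-ins for the real code, and child-reaping is
omitted for brevity):

    # %sites stands in for the big nested structure.
    my %sites = build_site_structure();

    for my $site (keys %sites) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: keep only this site's slice, drop the rest.
            my $work = delete $sites{$site};
            undef %sites;        # freed space is reused by perl itself
            process_site($work);
            exit 0;
        }
    }

From what I've read, the undef should work, with two catches: perl
keeps the freed space for its own reuse rather than handing it back to
the OS, and Linux fork() is copy-on-write anyway, so the pages holding
the structure only really get duplicated once a process writes to them.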

Ideally I'd love to just maintain a connection to the database and only
select sites as needed, but if the DBI handle is open it gets copied to
the child on a fork, and when the child closes it (or dies off) it
closes the connection in the parent too!!  So I have been designing
around this by connecting to the DB, building a structure,
disconnecting, and then working off the structure.  It just seems so
wasteful to have to check whether I still have a handle, and reconnect,
EVERY time a child dies.  I'm not sure why DBI handles do not become
separate on fork the way other file handles do, but in my testing they
do not.
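
For what it's worth, the DBI docs describe an InactiveDestroy attribute
aimed at exactly this fork problem: set it on the child's copy of the
handle, and destroying that copy no longer disconnects the parent.  A
sketch (the DSN and credentials are placeholders):

    use strict;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:sites', 'user', 'pass',
                           { RaiseError => 1 })
        or die $DBI::errstr;

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: keep DESTROY from closing the parent's connection.
        $dbh->{InactiveDestroy} = 1;
        # ...the child opens its OWN connection if it needs the DB...
        exit 0;
    }
    waitpid($pid, 0);
    # The parent's $dbh is still usable here.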

>Not sure if any of that helps you at all, but there is my two cents.
>
>
>Jeremy Elston
>Sr. Staff Unix System Administrator
>Electronic Brokerage Technology
>Charles Schwab & Co., Inc.
>(602.431.2087 [EMAIL PROTECTED])

It is nice to see that others are doing what I'm trying.  Your box is
sized quite a bit larger than my target box, but it seems this should be
doable; I just have to keep looking at "best design" practices for
forking, memory usage, etc.  We'll see how it goes!

Chuck


>>I guess you could use 'top' in a unix window.
>>
>>Then kick off 1 drone only and look at the memory usage.  Then multiply
>>it by the number of processes you expect.
>>55 is a lot of processes!!!!  On any unix system.
>>
>>Perhaps you could stagger the number of processes, so maybe spawn 30
>>drones and have each drone process 2 sites in serial, one after the
>>other?
>>
>>I've seen some pretty large perl processes on very similar spec
>>machines as yours.
>>
>>Marty

>>> -----Original Message-----
>>>
>>> I am designing a system to process almost 4000 remote sites in a
>>> nightly sweep.  This process is controlled from a database which
>>> maintains site status in realtime (or at least that's the goal).  I am
>>> attempting to fork off around 100 "drones" to process 100 concurrent
>>> sites.  Each drone will need a connection to the database.  In doing
>>> some impromptu testing I've had the following results...
>>>
>>> Writing a queen who does nothing and sets nothing (no variables or
>>> filehandles are open) except fork off drones, and writing a drone who
>>> only connects to the database and nothing else, had the following
>>> results on this machine config:
>>>
>>> RedHat Linux 6.2, PIII 600, 256MB RAM, 256MB Swap - Anything more
>>> than 55 drones and the system is entirely out of memory.
>>>
>>> Is Perl really using that much memory?  There is approx 190MB of RAM
>>> free, and nearly ALL the swap space free when I kick off the Queen
>>> process.
>>>
>>> Do these results seem typical?  Is there any way to optimize this?
>>>
>>> Thanks
>>>
>>> Chuck


_______________________________________________
Perl-Unix-Users mailing list. To unsubscribe go to 
http://listserv.ActiveState.com/mailman/subscribe/perl-unix-users
