Queue Suggestions?

2003-04-05 Thread Jeff Westman
I'm posed with a problem and looking for suggestions for a possible resolution.  I
have a script that has many steps in it, including telnet & ftp sessions,
database unloads, and other routines.  This script will run on a server,
accessing a remote server.  This works fine.  I will likely have several
dozen (maybe as many as 100) iterations of this script running
simultaneously.  The problem is that there is a bottleneck towards the end
of my script -- I have to call a 3rd party process that is single-threaded.
This means that if I have ~100 versions of my script running, I can only have
one at a time execute the 3rd party software.  It is very likely that
multiple versions will arrive at this bottleneck at the same time.
If more than one calls the third party program at once, one will run, and the
others will lose and die.

So I am looking for suggestions on how I might attack this problem.  I've
thought about building some sort of external queue (like a simple hash file).
The servers have names like server_01, server_02, etc.  When an iteration
of the script completes, it writes out its server name to the file, pauses,
then checks if any other iteration is running the third party software.  If
one is running, it waits, with its server name at the top of the file queue.
A problem, again, might be if two or more versions want to update this queue
file at the same time, so I thought maybe a random wait period before writing
to the queue file.
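
A rough sketch of that scheme (all file and routine names here are
hypothetical, and the update race described above remains):

    #!/usr/bin/perl
    # Sketch of the queue-file idea: join the queue, poll until we
    # are at the head, then run the single-threaded step.
    use strict;
    use warnings;

    my $queue = '/tmp/third_party.queue';   # assumed location
    my $me    = 'server_01';                # this iteration's name

    sleep int rand 10;                      # the proposed random wait

    open my $fh, '>>', $queue or die "append: $!";
    print {$fh} "$me\n";                    # join the back of the queue
    close $fh;

    # Poll until our name reaches the head of the queue.
    while (1) {
        open my $in, '<', $queue or die "read: $!";
        my $head = <$in>;
        close $in;
        $head = '' unless defined $head;
        chomp $head;
        last if $head eq $me;
        sleep 1;
    }

    # ... run the third party program here, then remove our line
    # from the file so the next waiter can proceed ...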

I'm open to other ideas.  (Please don't suggest we rename or copy the third
party software; it just isn't possible.)  I'm not looking for code, per se,
but ideas I can implement that will guarantee I will always have only one
copy of the external third party software running (including pre-checks,
queues, etc.).

Thanks,

Jeff




Re: Queue Suggestions?

2003-04-05 Thread Stefan Lidman
Hello,

Maybe you can use flock:

perldoc -f flock

I have never used this and don't know if it works in your case.
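
Something like this, perhaps (a minimal, untested sketch; the lock file
name is made up). The blocking LOCK_EX call is what makes callers wait
their turn:

    use strict;
    use warnings;
    use Fcntl ':flock';   # exports LOCK_EX, LOCK_UN, LOCK_SH, LOCK_NB

    open my $lock, '>>', '/tmp/third_party.lock' or die "open: $!";
    flock $lock, LOCK_EX;    # blocks until every earlier holder releases it
    # ... run the single-threaded step here ...
    flock $lock, LOCK_UN;
    close $lock;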

/Stefan

> I'm posed with a problem, looking for suggestions for possible
> resolution. [snip]




Re: Queue Suggestions?

2003-04-05 Thread Wiggins d'Anconia
Jeff Westman wrote:
> I'm posed with a problem, looking for suggestions for possible
> resolution. [snip]
Currently I am implementing a system that has similar features. Initially we
developed a set of three queues: a pre-processor that handles many elements
simultaneously; a middle queue (which, incidentally, handles external
encryptions/decryptions) that is very slow (seconds rather than milli- or
microseconds); and a final queue that handles sending files over FTP/SMTP,
which can be very, very slow (hours, depending on FTP timeout
limits... grrr, I know). For this we were looking for essentially an
event-based state machine, which (thank god) led my searching to POE (since
I keep mentioning it, this is why): http://poe.perl.org. After getting over
the POE learning curve, developing my queues was a snap. Because of business
decisions we have since moved to a nine-queue system (inbound/outbound sets,
plus a post-processing queue, plus a reroute queue (don't ask)).

Essentially a similar setup would work for you, where your middle queue
would have a threshold of 1 (aka only one process at a time), whereas all of
our stages are acceptable to have multiple versions running; we limit the
number of encryption processes happening simultaneously because of load
rather than problems. You may also want to have a look at the Event CPAN
module; it provides similar but lower-level functionality.

I can provide more details about the implementation of our system and
the development of our queues if you wish, but much to my dismay I
cannot provide source... hopefully this will get you started in any
case. Be sure to check out the POE examples, particularly the
multi-tasking process example.
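
For a flavour of what this looks like, here is a minimal hand-rolled POE
sketch (not code from the system above; the job names and the print are
placeholders). It drains a queue one item at a time, which is the
threshold-of-1 behaviour described:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use POE;   # assumes POE is installed from CPAN

    POE::Session->create(
        inline_states => {
            _start => sub {
                my ($kernel, $heap) = @_[KERNEL, HEAP];
                $heap->{jobs} = [ 'job_a', 'job_b', 'job_c' ];  # placeholder work
                $kernel->yield('next_job');
            },
            next_job => sub {
                my ($kernel, $heap) = @_[KERNEL, HEAP];
                my $job = shift @{ $heap->{jobs} } or return;   # queue drained
                print "processing $job\n";   # only one job runs at a time
                $kernel->yield('next_job');  # schedule the next item
            },
        },
    );

    POE::Kernel->run();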

http://danconia.org



Re: Queue Suggestions?

2003-04-05 Thread Rob Dixon
Jeff Westman wrote:
> I'm posed with a problem, looking for suggestions for possible
> resolution. [snip]

I don't think you need to get this complex, Jeff. If your bottleneck were /at/
the end of the processing I would suggest a queue file as you describe, but
not as a means of synchronising the individual scripts. As its final stage, each
script would simply append the details of its final operation to a serial file
and then exit. It would then be the job of a separate process to look at this
file periodically and execute any requests which may have been written.
That will effectively serialise your operations.
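
A sketch of what that separate process might look like, assuming producers
append one request per line while holding the same flock, and with made-up
names for the queue file and the third-party program:

    #!/usr/bin/perl
    # Hypothetical single consumer: because only this process executes
    # requests, they are serialised automatically.
    use strict;
    use warnings;
    use Fcntl ':flock';

    my $queue_file = 'queue';                 # assumed name

    while (1) {
        open my $fh, '+>>', $queue_file or die "open: $!";  # create if absent
        flock $fh, LOCK_EX;                   # producers append under this lock
        seek $fh, 0, 0;                       # append mode left us at end-of-file
        my @requests = <$fh>;
        truncate $fh, 0 if @requests;         # claim the whole batch
        close $fh;                            # releases the lock

        for my $request (@requests) {
            chomp $request;
            system('/path/to/third_party', $request) == 0
                or warn "third party failed for '$request': $?";
        }
        sleep 10;                             # poll interval
    }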

However, since your process may not be able to exit straight away, what you
need, as Stefan says, is a simple dummy file lock. The following will do the
trick:

use strict;
use Fcntl ':flock';

open my $que, '>>', 'queue'
    or die "Couldn't open lock file: $!";

flock $que, LOCK_EX or die "Failed to lock queue: $!";
do_single_thread_op();
flock $que, LOCK_UN;

close $que;

Fcntl is there solely to add the LOCK_EX and LOCK_UN identifiers. I've opened
the file for append so that the file will be created if it isn't already there, but
will be left untouched if it is. The 'flock' call to lock exclusively will wait
indefinitely until it succeeds, which means that the process has come to the
head of the queue. It then has sole access to your third-party process and can
use it as it needs to before unlocking the file, when the next process that it
may have been holding up will be granted its lock and can continue.

I hope this helps,

Rob







Re: Queue Suggestions?

2003-04-05 Thread Jeff Westman
Rob,

I think you're right.  The idea would be to have the next-to-be-processed
server name appended to the file, and then have the next step call a single
separate script (starting it if it is not already running, otherwise simply
waiting) that would lock the control file.  That script would be the single
entry point to the 3rd party software, controlling the processes so that only
one runs at a time.  My thinking before was to have this be part of every
script (as the last step), but then it got really complicated thinking about
queues, random wait times, checking, double checking, etc.
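
A minimal sketch of such an entry-point wrapper (the lock file and program
paths are made up); every script would call this instead of the third party
program directly:

    #!/usr/bin/perl
    # Hypothetical wrapper: the single entry point to the 3rd party software.
    use strict;
    use warnings;
    use Fcntl ':flock';

    open my $lock, '>>', '/tmp/third_party.lock' or die "open: $!";
    flock $lock, LOCK_EX;        # wait our turn; at most one holder at a time
    system('/path/to/third_party', @ARGV) == 0
        or warn "third party exited with status $?";
    flock $lock, LOCK_UN;
    close $lock;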

Sometimes simpler is better.  Thanks for the suggestion!

-Jeff

___

> I don't think you need to get this complex, Jeff. If your bottleneck
> were /at/ the end of the processing I would suggest a queue file as
> you describe, but not as a means of synchronising the individual
> scripts. [snip]
