Queue Suggestions?
I'm faced with a problem and am looking for suggestions for a possible resolution.

I have a script that has many steps in it, including telnet and ftp sessions, database unloads, and other routines. This script will run on a server, accessing a remote server. This works fine. I will likely have several dozen (maybe as many as 100) iterations of this script running simultaneously. The problem is that there is a bottleneck towards the end of my script: I have to call a 3rd party process that is single-threaded. This means that if I have ~100 instances of my script running, only one at a time can execute the 3rd party software. It is very likely that multiple instances will arrive at this bottleneck at the same time. If more than one calls the third party program, one will run, and the other will lose and die.

So I am looking for suggestions on how I might attack this problem. I've thought about building some sort of external queue (like a simple hash file). The servers have names like server_01, server_02, etc. When an iteration of the script completes, it writes its server name to the file, pauses, then checks whether any other iteration is running the third party software. If one is running, it waits, with its server name at the top of the file queue. A problem might arise if, again, two or more instances want to update this queue file at once, so I thought maybe a random wait period before writing to the file queue.

I'm open to other ideas. (Please don't suggest we rename or copy the third party software; it just isn't possible.) I'm not looking for code, per se, but for ideas I can implement that will guarantee I will always have only one copy of the external third party software running (including pre-checks, queues, etc.).

Thanks,
Jeff

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: Queue Suggestions?
Hello,

maybe you can use flock:

    perldoc -f flock

I have never used it and don't know if it works in your case.

/Stefan
Re: Queue Suggestions?
Currently I am implementing a system that has similar features. Initially we developed a set of 3 queues: a pre-processor that handles many elements simultaneously; a middle queue that (incidentally) handles external encryptions/decryptions, which are very slow (seconds rather than milliseconds or microseconds); and a final queue that handles the sending of files over FTP/SMTP, which can be very, very slow (hours, depending on FTP timeout limits... grrr, I know).

For this we were looking for essentially an event-based state machine concept, which (thank god) led my searching to POE (this is why I keep mentioning it): http://poe.perl.org

After getting over the POE learning curve, developing my queues was a snap. Because of business decisions we have since moved to a 9 queue system (inbound/outbound sets, plus a post-processing queue, plus a reroute queue (don't ask)).

Essentially a similar setup would work for you, where your middle queue would have a threshold of 1 (i.e. only one process at a time). All of our stages can have multiple instances running, but we want to limit the number of encryption processes happening simultaneously because of load rather than because they would fail.

You may also want to have a look at the Event CPAN module; it provides similar but lower-level functionality.

I can provide more details about the implementation of our system and the development of our queues if you wish, but much to my dismay I cannot provide source. Hopefully this will get you started in any case. Be sure to check out the examples on the POE site, particularly the multi-tasking process example.

http://danconia.org
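To make the "threshold" idea in the message above concrete, here is a minimal sketch. It deliberately does not use POE; it swaps in plain fork/wait from core Perl, and the name run_limited is hypothetical. It only illustrates capping how many jobs of one stage run at once, so a limit of 1 gives the strictly serial middle stage described above.

```perl
use strict;
use warnings;

# Run each job (a code ref) in its own child process, never allowing
# more than $limit children to be alive at the same time.
# run_limited is a hypothetical helper, not part of POE or any module.
sub run_limited {
    my ($limit, @jobs) = @_;
    my $running = 0;
    my $failed  = 0;

    for my $job (@jobs) {
        # At the threshold: block until one child has finished.
        if ($running >= $limit) {
            wait();
            $failed++ if $?;
            $running--;
        }

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {    # child process
            $job->();
            exit 0;
        }
        $running++;
    }

    # Reap whatever is still running.
    while ($running-- > 0) {
        wait();
        $failed++ if $?;
    }
    return $failed;    # number of children that exited non-zero
}

# Usage sketch (call_third_party is a placeholder):
# my $failed = run_limited(1, map { my $s = $_; sub { call_third_party($s) } } @servers);
```

This only limits jobs started by one parent process. Independent scripts that are launched separately would still need something shared between them, such as a lock file.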
Re: Queue Suggestions?
I don't think you need to get this complex, Jeff. If your bottleneck were /at/ the end of the processing I would suggest a queue file as you describe, but not as a means of synchronising the individual scripts.
As its final stage each script would simply append the details of its final operation to a serial file and then exit. It would then be the job of a separate process to look at this file periodically and execute any requests which may have been written. That will effectively serialise your operations.

However, since your process may not be able to exit straight away, what you need, as Stefan says, is a simple dummy file lock. The following will do the trick:

    use strict;
    use warnings;
    use Fcntl ':flock';    # imports LOCK_EX and LOCK_UN

    # Open for append: creates the file if it is missing, leaves it untouched otherwise
    open my $que, '>>', 'queue' or die "Couldn't open lock file: $!";

    # Blocks until this process holds the exclusive lock
    flock $que, LOCK_EX or die "Failed to lock queue: $!";

    do_single_thread_op();    # placeholder for the call to the third-party program

    flock $que, LOCK_UN;
    close $que;

Fcntl is there solely to add the LOCK_EX and LOCK_UN identifiers. I've opened the file for append so that the file will be created if it isn't already there, but will be left untouched if it is.

The 'flock' call to lock exclusively will wait indefinitely until it succeeds, which means that the process has come to the head of the queue. It then has sole access to your third-party process and can use it as it needs to before unlocking the file. At that point the next process that it may have been holding up will be granted its lock and can continue.

I hope this helps,

Rob
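The serial-file variant Rob describes first (each script appends its request and exits, and one separate process works through the file) might look roughly like the sketch below. The names enqueue and drain and the one-request-per-line file layout are assumptions made for illustration; only flock and the Fcntl constants come from the thread.

```perl
use strict;
use warnings;
use Fcntl ':flock';

# Last step of each script: append one request line to the serial file.
# (enqueue is a hypothetical name.)
sub enqueue {
    my ($file, $request) = @_;
    open my $fh, '>>', $file or die "Couldn't open $file: $!";
    flock($fh, LOCK_EX)      or die "Couldn't lock $file: $!";
    print {$fh} "$request\n";
    close $fh                or die "Couldn't close $file: $!";    # flushes, then releases the lock
    return;
}

# Called periodically by the single dispatcher process: remove every
# pending request from the file and return them in arrival order.
# (drain is a hypothetical name.)
sub drain {
    my ($file) = @_;
    open my $fh, '+<', $file or return;    # file does not exist yet: nothing queued
    flock($fh, LOCK_EX)      or die "Couldn't lock $file: $!";
    chomp(my @requests = <$fh>);
    truncate($fh, 0)         or die "Couldn't truncate $file: $!";
    close $fh;
    return @requests;
}

# Dispatcher loop sketch (do_single_thread_op is the placeholder from the message above):
# while (1) {
#     do_single_thread_op($_) for drain('queue');
#     sleep 5;
# }
```

Because writers open the file in append mode and take the lock before printing, a request written while the dispatcher is draining simply lands in the emptied file and is picked up on the next pass.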
Re: Queue Suggestions?
Rob,

I think you're right. I think the idea would be to have the name of the server that is next to be processed appended to the file, and then have the next step call a single separate script (start it if it is not already running, otherwise simply wait) that would lock the control file. This script would be the single entry point to the 3rd party software, controlling processes so that they run only one at a time.

My thinking before was to have this be part of every script (as the last step), but then it got really complicated thinking about queues, random wait times, and then checking, double checking, etc. Sometimes simpler is better.

Thanks for the suggestion!

-Jeff
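One way the "start it if not already running" step could be handled is with a non-blocking lock on the control file, so that a second copy of the dispatcher notices the first and backs off. This is a sketch under assumptions: become_dispatcher and the control file name are hypothetical, while LOCK_NB is the standard Fcntl flag that makes flock return immediately instead of waiting.

```perl
use strict;
use warnings;
use Fcntl ':flock';

# Try to become the one dispatcher. Returns the locked filehandle on
# success (keep it open for as long as the dispatcher runs), or undef
# if another process already holds the control file.
# (become_dispatcher is a hypothetical name.)
sub become_dispatcher {
    my ($control_file) = @_;
    open my $fh, '>>', $control_file
        or die "Couldn't open $control_file: $!";
    return $fh if flock($fh, LOCK_EX | LOCK_NB);
    close $fh;
    return;
}

# Usage sketch at the top of the dispatcher script:
# my $lock = become_dispatcher('dispatcher.lock')
#     or exit 0;    # a dispatcher is already running, nothing to do
# ... loop over the queue file, calling the third-party program ...
```

The operating system drops the lock when the holding process exits, even after a crash, so there is no stale marker to clean up before the next start.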