processing at preset times
I have a script that takes a very long time to run - hours, sometimes even days (even on a P2-2.8Ghz machine). After loading some data from a database at the beginning (less than a second), the script does no i/o until results are output at the end of the script.

I'd like to know how the script is progressing through its data, so I added some code to update the database at regular data intervals, but this has some problems:
- database is remote, so the script goes much slower with these status updates.
- updates are based on data instead of clock.

I read something about threads in perl and was wondering if these status updates should be coded inside a thread so they have less impact on overall script performance. The number crunching could still go on while the database update happens in a separate thread.

To get updates based on clock instead of data... Are there tools within perl for using clock/timer information? Do I have to parse clock/timer info myself to make something happen every hour inside an existing loop?

I realise that my subject line might suggest use of cron, but this is not workable unless there is some way for two scripts to communicate with each other. If this could work, the processing script would probably still need a thread to do communication with the timer script anyway.

Frank

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/ http://learn.perl.org/first-response
Re: processing at preset times
Frank Bax wrote:
> I have a script that takes a very long time to run - hours, sometimes
> even days (even on a P2-2.8Ghz machine). After loading some data from
> a database at the beginning (less than a second), the script does no
> i/o until results are output at the end of the script.
>
> I'd like to know how the script is progressing through its data, so I
> added some code to update the database at regular data intervals, but
> this has some problems:
> - database is remote, so the script goes much slower with these status
>   updates.
> - updates are based on data instead of clock.
>
> I read something about threads in perl and was wondering if these
> status updates should be coded inside a thread so they have less
> impact on overall script performance. The number crunching could still
> go on while the database update happens in a separate thread.

Well, threads don't magically turn one CPU into two. However, this particular case sounds like a good application for threads. Your main crunching thread can be running along while the database updating thread is blocked waiting for the database to respond.

> To get updates based on clock instead of data... Are there tools
> within perl for using clock/timer information? Do I have to parse
> clock/timer info myself to make something happen every hour inside an
> existing loop?

You can just use the simple built-in time() function, which returns seconds since the epoch. If you want to update the database once per hour, you could do something like this in the updating thread:

    my $t = time;
    while (1) {
        do_database_update();
        $t += 3600;               # start of next update
        my $secs = $t - time;     # seconds remaining until then
        sleep $secs if $secs > 0;
    }

The tricky part is sharing data between the threads, which is extremely sucky in Perl's ithreads implementation, IMO. Read perlthrtut for an overview of the issues.
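For the "every hour inside an existing loop" part of the question, a thread is not strictly required: the crunching loop can compare time() against a deadline on each pass. The following is only a minimal sketch of that idea. make_ticker is a hypothetical helper name, and the explicit clock argument exists purely so the logic can be exercised without waiting.

```perl
use strict;
use warnings;

# Returns a closure that reports true at most once per $interval seconds.
# make_ticker and its arguments are illustrative names, not from the thread.
sub make_ticker {
    my ($interval, $now) = @_;
    $now = time unless defined $now;
    my $next_due = $now + $interval;
    return sub {
        my $t = @_ ? shift : time;    # an explicit clock value is allowed for testing
        return 0 if $t < $next_due;
        # Skip intervals that were missed entirely, so one slow pass
        # does not trigger a burst of back-to-back updates.
        $next_due += $interval while $next_due <= $t;
        return 1;
    };
}
```

Inside the main loop this would be used roughly as `my $due = make_ticker(3600);` before the loop and `do_database_update() if $due->();` after each unit of work. The check itself is a single integer comparison, so its cost per pass should be small next to the number crunching, but the update still blocks the loop while it runs.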
Re: processing at preset times
On 11/8/05, Frank Bax [EMAIL PROTECTED] wrote:
> I have a script that takes a very long time to run - hours, sometimes
> even days (even on a P2-2.8Ghz machine). After loading some data from
> a database at the beginning (less than a second), the script does no
> i/o until results are output at the end of the script.
>
> I'd like to know how the script is progressing through its data, so I
> added some code to update the database at regular data intervals, but
> this has some problems:
> - database is remote, so the script goes much slower with these status
>   updates.
> - updates are based on data instead of clock.
>
> I read something about threads in perl and was wondering if these
> status updates should be coded inside a thread so they have less
> impact on overall script performance. The number crunching could still
> go on while the database update happens in a separate thread.
>
> To get updates based on clock instead of data... Are there tools
> within perl for using clock/timer information? Do I have to parse
> clock/timer info myself to make something happen every hour inside an
> existing loop?
>
> I realise that my subject line might suggest use of cron, but this is
> not workable unless there is some way for two scripts to communicate
> with each other. If this could work, the processing script would
> probably still need a thread to do communication with the timer script
> anyway.
>
> Frank

Frank,

See perldoc -f alarm and perlipc for details, but the normal idiom for this sort of thing is to trap SIGALRM. A little pseudocode:

    my $timeout = 3600; # 1 hour

    while ( @your_data ) {
        eval {
            local $SIG{ALRM} = sub {
                # do something to save state
                die "alarm\n";    # NB: \n required
            };
            alarm $timeout;
            while ( @your_data ) {
                my $data = shift @your_data;
                # process your data, calling a subroutine
                # for the heavy lifting makes sense
            }
            alarm 0; # unset the alarm after you finish the last pass
        };
        if ( $@ && $@ eq "alarm\n" ) {
            # timed out
            your_log_sub();     # perform your logging here
            recover_state();    # e.g. push interrupted procedure back onto stack
        } elsif ($@) {
            # do something about other errors
        }
    }

As for how you do your logging, it's certainly simplest to just attach to the database and log as you go along. It's hard to believe that the time taken for a database connection every couple of hours would be that inhibiting on a process that runs for days. If it really is a big deal, though, your log subroutine can use fork() to spawn a subprocess to log. See perldoc -f fork and perlipc for details.

Threading probably isn't what you want here, since it will, among other things, copy the entire data structure at the time the thread is created. You could create a logging thread early in your program before you load your data set, but in my mind at least that seems like overkill for simple logging for a non-daemon process. perlthrtut is a good place to start learning about threads.

HTH

-- jay
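The fork() suggestion above could look something like the following. This is a rough sketch rather than a tested recipe for any particular database: log_in_child and reap_children are hypothetical names, and the code reference passed in stands for whatever does the actual status update. It assumes a Unix-like system where fork() is native.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Run $work->() in a child process so the parent can keep crunching.
# Returns the child's PID. log_in_child is an illustrative name.
sub log_in_child {
    my ($work) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: do the slow update, then leave immediately without
        # running END blocks or destructors inherited from the parent.
        my $ok = eval { $work->(); 1 };
        POSIX::_exit($ok ? 0 : 1);
    }
    return $pid;    # parent returns at once and carries on
}

# Collect any finished children without blocking, to avoid zombies.
sub reap_children {
    my $reaped = 0;
    $reaped++ while waitpid(-1, WNOHANG) > 0;
    return $reaped;
}
```

One design point worth noting: the child gets a copy of the parent's memory at the moment of the fork, so it sees a consistent snapshot of the progress counters. It is probably safest for the child to open its own database connection rather than reuse a handle inherited from the parent.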
Re: processing at preset times
Frank Bax wrote:
> I realise that my subject line might suggest use of cron, but this is
> not workable unless there is some way for two scripts to communicate
> with each other. If this could work, the processing script would
> probably still need a thread to do communication with the timer script
> anyway.

Actually, using cron may not be a bad idea. There is nothing like having a 100 hour job crash after 99 hours and 59 minutes.

If you can, consider rewriting the program so it works in small chunks. Use cron to restart the program at regular intervals. You can get the status of the process by examining the state of the temporary files, using another cron job. Even if you have a power failure, the process will be restarted when the computer reboots. An added bonus is you don't have to log in in the wee hours of the morning just to make sure the program is still running.

--
Just my 0.0002 million dollars worth,
--- Shawn

Probability is now one. Any problems that are left are your own.
SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_
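As an illustration of the "small chunks" approach, here is one way a run could pick up where the previous one stopped. It is only a sketch under assumptions: run_chunk, the layout of the state hash, and the callback are made up for the example, and Storable is used simply because it ships with Perl.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Process up to $chunk items per run, recording progress in $state_file.
# run_chunk and the { next, total } state layout are illustrative only.
sub run_chunk {
    my ($state_file, $items, $chunk, $process) = @_;
    my $state = -e $state_file
        ? retrieve($state_file)
        : { next => 0, total => 0 };

    my $end = $state->{next} + $chunk;
    $end = @$items if $end > @$items;    # do not run past the data

    for my $i ($state->{next} .. $end - 1) {
        $state->{total} += $process->($items->[$i]);
    }
    $state->{next} = $end;

    # Write to a temporary file and rename it into place, so a crash
    # in the middle of the write cannot leave a corrupt checkpoint.
    store($state, "$state_file.tmp");
    rename "$state_file.tmp", $state_file or die "rename failed: $!";
    return $state;
}
```

A separate status script run from cron could then call retrieve() on the same file and report `next` against the number of items, which addresses the original worry about two scripts communicating.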
RE: processing at preset times
-----Original Message-----
From: Shawn Corey [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 08, 2005 10:59 AM
To: beginners@perl.org
Subject: Re: processing at preset times

> Frank Bax wrote:
> > I realise that my subject line might suggest use of cron, but this
> > is not workable unless there is some way for two scripts to
> > communicate with each other. If this could work, the processing
> > script would probably still need a thread to do communication with
> > the timer script anyway.
>
> Actually, using cron may not be a bad idea. There is nothing like
> having a 100 hour job crash after 99 hours and 59 minutes.
>
> If you can, consider rewriting the program so it works in small
> chunks. Use cron to restart the program at regular intervals. You can
> get the status of the process

You may also want to consider init for process respawning. You can also set the runlevels at which the script will execute (in the event that you need it to run at lower/higher runlevels across reboots).

> by examining the state of the temporary files, using another cron
> job. Even if you have a power failure, the process will be restarted
> when the computer reboots. An added bonus is you don't have to log in
> in the wee hours of the morning just to make sure the program is
> still running.
>
> --
> Just my 0.0002 million dollars worth,
> --- Shawn
Re: processing at preset times
Frank Bax wrote:
> I have a script that takes a very long time to run - hours, sometimes
> even days (even on a P2-2.8Ghz machine). After loading some data from
> a database at the beginning (less than a second), the script does no
> i/o until results are output at the end of the script.
>
> I'd like to know how the script is progressing through its data, so I
> added some code to update the database at regular data intervals, but
> this has some problems:
> - database is remote, so the script goes much slower with these status
>   updates.
> - updates are based on data instead of clock.
>
> I read something about threads in perl and was wondering if these
> status updates should be coded inside a thread so they have less
> impact on overall script performance. The number crunching could still
> go on while the database update happens in a separate thread.
>
> To get updates based on clock instead of data... Are there tools
> within perl for using clock/timer information?

perldoc Devel::DProf
perldoc Time::HiRes

John
--
use Perl;
program
fulfillment
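Of the two modules mentioned, Time::HiRes is the one that gives sub-second clock readings from inside a script. A small sketch of how it might be used to see how long a status update really takes follows. The helper name timed is an assumption for the example; gettimeofday and tv_interval are the module's own functions.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Run a code reference and return (elapsed seconds, its result).
# timed is a hypothetical helper name.
sub timed {
    my ($code) = @_;
    my $t0     = [gettimeofday];     # [seconds, microseconds]
    my $result = $code->();
    # With a single argument, tv_interval measures from $t0 to now.
    return (tv_interval($t0), $result);
}
```

Wrapping the existing database update in something like `my ($secs) = timed(\&do_database_update);` would show whether the remote update is really the bottleneck before any threads or forks are introduced.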