Re: processing at preset times

2005-11-08 Thread Bob Showalter

Frank Bax wrote:
> I have a script that takes a very long time to run - hours, sometimes even
> days (even on a P2-2.8Ghz machine).  After loading some data from a
> database at the beginning (less than a second), the script does no i/o
> until results are output at the end of script.  I'd like to know how the
> script is progressing through its data, so I added some code to update
> the database at regular data intervals, but this has some problems:
> - database is remote, so script goes much slower with these status
> updates.
> - updates are based on data instead of clock.
>
> I read something about threads in perl and was wondering if these status
> updates should be coded inside a thread so they have less impact on
> overall script performance.  The number crunching could still go on
> while database update happens in separate thread.

Well, threads don't magically turn one CPU into two. However, this 
particular case sounds like a good application for threads. Your main 
crunching thread can be running along while the database updating thread 
is blocked waiting for the database to respond.

> To get updates based on clock instead of data...  Are there tools within
> perl for using clock/timer information?  Do I have to parse clock/timer
> info myself to make something happen every hour inside an existing loop?

You can just use the simple built-in time() function, which returns 
seconds since the epoch. If you want to update the database once per 
hour, you could do something like this:

   my $t = time;
   while (1) {
       # ... update the database here ...
       $t += 3600;               # time of next update
       my $secs = $t - time;     # seconds left until then
       sleep $secs if $secs > 0;
   }
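
For the other half of the question (making something happen every hour inside an existing loop), here is a minimal sketch that polls time() from the crunching loop itself. The names @work, $interval and report_status() are made up for illustration, and the print stands in for the real database update.

```perl
use strict;
use warnings;

my $interval    = 3600;              # seconds between status updates
my $next_update = time + $interval;  # when the first update is due
my @work        = ( 1 .. 100_000 );  # hypothetical stand-in for the real data
my $done        = 0;

for my $item (@work) {
    # ... number crunching for $item goes here ...
    $done++;

    # time() is cheap, so checking it on every pass costs very little
    if ( time >= $next_update ) {
        report_status( $done, scalar @work );
        $next_update += $interval;
    }
}

# Hypothetical reporting routine; replace the body with the database update
sub report_status {
    my ( $count, $total ) = @_;
    printf STDERR "%d of %d items processed\n", $count, $total;
}
```

This variant needs no thread at all, but the database round trip still blocks the crunching while it happens.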

The tricky part is sharing data between the threads, which is extremely 
sucky in Perl's ithreads implementation, IMO. Read perlthrtut for an 
overview of the issues.
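
A rough sketch of the two-thread arrangement, assuming a perl built with ithreads. The shared variables $done and $finished and the short 2-second interval are illustrative choices, and the print marks the spot where the database update would go.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $done     :shared = 0;   # items processed so far
my $finished :shared = 0;   # set by the main thread when the work is over

# Reporter thread: wakes up periodically and reports the shared counter.
# Created before any large data set is loaded, so little gets copied.
my $reporter = threads->create( sub {
    my $interval = 2;       # seconds; an hourly update would use 3600
    until ($finished) {
        sleep $interval;
        my $snapshot;
        { lock($done); $snapshot = $done; }
        print STDERR "processed $snapshot items\n";   # database update goes here
    }
} );

for my $item ( 1 .. 200_000 ) {
    # ... number crunching for $item goes here ...
    lock($done);            # released at the end of each loop iteration
    $done++;
}

{ lock($finished); $finished = 1; }
$reporter->join;            # may wait up to one interval for the reporter to notice
```

Only two scalars are shared here, which sidesteps most of the pain. Locking on every item has a cost of its own, so a real script might bump the counter once per batch instead.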

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: processing at preset times

2005-11-08 Thread Jay Savage
On 11/8/05, Frank Bax <[EMAIL PROTECTED]> wrote:
> I have a script that takes a very long time to run - hours, sometimes even
> days (even on a P2-2.8Ghz machine).  After loading some data from a database
> at the beginning (less than a second), the script does no i/o until results
> are output at the end of script.  I'd like to know how the script is
> progressing through its data, so I added some code to update the database
> at regular data intervals, but this has some problems:
> - database is remote, so script goes much slower with these status 
> updates.
> - updates are based on data instead of clock.
> I read something about threads in perl and was wondering if these status
> updates should be coded inside a thread so they have less impact on overall
> script performance.  The number crunching could still go on while database
> update happens in separate thread.
> To get updates based on clock instead of data...  Are there tools within
> perl for using clock/timer information?  Do I have to parse clock/timer
> info myself to make something happen every hour inside an existing loop?
> I realise that my subject line might suggest use of cron, but this is not
> workable unless there is some way for two scripts to communicate with each
> other.  If this could work, the processing script would probably still need
> a thread to do communication with timer script anyway.
> Frank


See perldoc -f alarm and perlipc for details, but the normal idiom for
this sort of thing is to trap SIGALRM. A little pseudocode:

my $timeout = 3600; # 1 hour

while ( @your_data ) {
    eval {
        local $SIG{ALRM} = sub {
            # do something to save state
            die "alarm\n";    # NB: \n required
        };

        alarm $timeout;

        while ( @your_data ) {
            my $data = shift @your_data;
            # process your data, calling a subroutine
            # for the heavy lifting makes sense
        }
        alarm 0; # unset the alarm after you finish the last pass
    };

    if ( $@ && $@ eq "alarm\n" ) {
        # timed out
        your_log_sub();  # perform your logging here
        # e.g. push interrupted procedure back onto stack
    } elsif ( $@ ) {
        # do something about other errors
    }
}
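
A variation on the same idea, sketched under the assumption that the work should not be interrupted midway: the SIGALRM handler only sets a flag and re-arms the timer, and the loop reports when it next reaches a safe point. The names $report_due and $interval are invented for the example, and @your_data is filled with placeholder values.

```perl
use strict;
use warnings;

my $interval   = 3600;   # seconds between reports
my $report_due = 0;      # set by the signal handler, cleared by the loop

# The handler does as little as possible: flag the report and re-arm.
$SIG{ALRM} = sub { $report_due = 1; alarm $interval; };
alarm $interval;

my @your_data = ( 1 .. 100_000 );   # hypothetical stand-in for the real data
my $done      = 0;

while ( @your_data ) {
    my $data = shift @your_data;
    # ... process $data ...
    $done++;

    # Report between items, where the program state is consistent
    if ($report_due) {
        $report_due = 0;
        print STDERR "$done items done, ", scalar(@your_data), " to go\n";
    }
}
alarm 0;   # cancel the pending alarm
```

Nothing has to be pushed back onto the stack with this form, at the price of the report being delayed until the current item finishes.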

As for how you do your logging, it's certainly simplest to just attach
to the database and log as you go along. It's hard to believe that the
time taken for a database connection every couple of hours would be
that inhibiting on a process that runs for days.

If it really is a big deal, though, your log subroutine can use fork()
to spawn a subprocess to log. See perldoc -f fork and perlipc for details.
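
Something along these lines, as a sketch only: log_in_background() is a made-up helper name, and the print in the child marks where the database connection and update would go.

```perl
use strict;
use warnings;
use POSIX ();

$SIG{CHLD} = 'IGNORE';   # let the system reap finished children (no zombies)

# Hypothetical helper: hand the status message to a child process so the
# parent can go straight back to number crunching.
sub log_in_background {
    my ($message) = @_;
    my $pid = fork();
    if ( !defined $pid ) {
        warn "fork failed: $!";
        return;
    }
    return $pid if $pid;          # parent: return immediately

    # child: connect to the database and write $message here
    print STDERR "status: $message\n";
    POSIX::_exit(0);              # leave without running the parent's END blocks
}

log_in_background("42 of 100 chunks done");
```

The child should open its own database connection rather than reuse a handle inherited from the parent.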

Threading probably isn't what you want here, since it will, among
other things, copy the entire data structure at the time the thread is
created. You could create a logging thread early in your program
before you load your data set, but in my mind at least that seems like
overkill for simple logging for a non-daemon process. perlthrtut is a
good place to start learning about threads.


-- jay

daggerquill [at] gmail [dot] com


Re: processing at preset times

2005-11-08 Thread Shawn Corey

Frank Bax wrote:
> I realise that my subject line might suggest use of cron, but this is
> not workable unless there is some way for two scripts to communicate
> with each other.  If this could work, the processing script would
> probably still need a thread to do communication with timer script anyway.

Actually, using cron may not be a bad idea. There is nothing like having 
a 100 hour job crash after 99 hours and 59 minutes. If you can, consider 
rewriting the program so it works in small chunks. Use cron to restart 
the program at regular intervals. You can get the status of the process 
by examining the state of the temporary files, using another cron job. 
Even if you have a power failure, the process will be restarted when the 
computer reboots. An added bonus is you don't have to log in in the wee 
hours of the morning just to make sure the program is still running.
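
A bare-bones sketch of the chunked approach, assuming the job's state fits in a small hash. The file name crunch.state, the chunk size and the summing loop are all placeholders for the real thing.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

my $state_file = 'crunch.state';   # hypothetical checkpoint file
my $chunk_size = 1000;             # items to process per run
my $total      = 5000;             # hypothetical total amount of work

# Resume from the last checkpoint, or start fresh
my $state = -e $state_file ? retrieve($state_file) : { next => 0, sum => 0 };

my $end = $state->{next} + $chunk_size;
$end = $total if $end > $total;

for my $i ( $state->{next} .. $end - 1 ) {
    $state->{sum} += $i;           # stand-in for the real computation
}
$state->{next} = $end;

# Write to a temporary file, then rename, so a crash never leaves a
# half-written checkpoint behind
store( $state, "$state_file.tmp" );
rename "$state_file.tmp", $state_file or die "rename failed: $!";

my $msg = $state->{next} >= $total ? "finished" : "at $state->{next} of $total";
print "$msg\n";
```

Each cron run does one chunk and exits; a second cron job (or a person) can read the same file to see how far along things are.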


Just my 0.0002 million dollars worth,
   --- Shawn

"Probability is now one. Any problems that are left are your own."
   SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_


RE: processing at preset times

2005-11-08 Thread Ryan Frantz

> -----Original Message-----
> From: Shawn Corey [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, November 08, 2005 10:59 AM
> To:
> Subject: Re: processing at preset times
> Frank Bax wrote:
> > I realise that my subject line might suggest use of cron, but this is
> > not workable unless there is some way for two scripts to communicate
> > with each other.  If this could work, the processing script would
> > probably still need a thread to do communication with timer script
> > anyway.
> Actually, using cron may not be a bad idea. There is nothing like having
> a 100 hour job crash after 99 hours and 59 minutes. If you can, consider
> rewriting the program so it works in small chunks. Use cron to restart
> the program at regular intervals. You can get the status of the process

You may also want to consider init for process respawning.  You can also
set the runlevels at which the script will execute (in the event that you
need it to run at lower/higher runlevels across reboots).

> by examining the state of the temporary files, using another cron job.
> Even if you have a power failure, the process will be restarted when the
> computer reboots. An added bonus is you don't have to log in in the wee
> hours of the morning just to make sure the program is still running.
> --
> Just my 0.0002 million dollars worth,
> --- Shawn
> "Probability is now one. Any problems that are left are your own."
> SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_


Re: processing at preset times

2005-11-08 Thread John W. Krahn
Frank Bax wrote:
> I have a script that takes a very long time to run - hours, sometimes even
> days (even on a P2-2.8Ghz machine).  After loading some data from a
> database at the beginning (less than a second), the script does no i/o
> until results are output at the end of script.  I'd like to know how the
> script is progressing through its data, so I added some code to update
> the database at regular data intervals, but this has some problems:
> - database is remote, so script goes much slower with these status
> updates.
> - updates are based on data instead of clock.
> I read something about threads in perl and was wondering if these status
> updates should be coded inside a thread so they have less impact on
> overall script performance.  The number crunching could still go on
> while database update happens in separate thread.
> To get updates based on clock instead of data...  Are there tools within
> perl for using clock/timer information?

perldoc Devel::DProf
perldoc Time::HiRes
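
As a small illustration of the second pointer, Time::HiRes can replace the built-in time and sleep with floating-point versions, which is handy for timing one slice of work. The 0.25 second sleep is only a stand-in for real processing.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # sub-second versions of time and sleep

my $start = time;                 # floating-point seconds since the epoch
sleep 0.25;                       # stand-in for a slice of real work
my $elapsed = time - $start;
printf "slice took %.3f seconds\n", $elapsed;
```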

use Perl;

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]