processing at preset times

2005-11-08 Thread Frank Bax
I have a script that takes a very long time to run - hours, sometimes even 
days (even on a P2-2.8GHz machine).  After loading some data from a database 
at the beginning (less than a second), the script does no i/o until results 
are output at the end of script.  I'd like to know how the script is 
progressing through its data, so I added some code to update the database 
at regular data intervals, but this has some problems:

- database is remote, so script goes much slower with these status 
updates.
- updates are based on data instead of clock.

I read something about threads in perl and was wondering if these status 
updates should be coded inside a thread so they have less impact on overall 
script performance.  The number crunching could still go on while database 
update happens in separate thread.


To get updates based on clock instead of data...  Are there tools within 
perl for using clock/timer information?  Do I have to parse clock/timer 
info myself to make something happen every hour inside an existing loop?


I realise that my subject line might suggest use of cron, but this is not 
workable unless there is some way for two scripts to communicate with each 
other.  If this could work, the processing script would probably still need 
a thread to do communication with timer script anyway.


Frank


--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/ http://learn.perl.org/first-response




Re: processing at preset times

2005-11-08 Thread Bob Showalter

Frank Bax wrote:
> I have a script that takes a very long time to run - hours, sometimes even
> days (even on a P2-2.8GHz machine).  After loading some data from a
> database at the beginning (less than a second), the script does no i/o
> until results are output at the end of script.  I'd like to know how the
> script is progressing through its data, so I added some code to update
> the database at regular data intervals, but this has some problems:
>
> - database is remote, so script goes much slower with these status
>   updates.
> - updates are based on data instead of clock.
>
> I read something about threads in perl and was wondering if these status
> updates should be coded inside a thread so they have less impact on
> overall script performance.  The number crunching could still go on
> while database update happens in separate thread.


Well, threads don't magically turn one CPU into two. However, this 
particular case sounds like a good application for threads. Your main 
crunching thread can be running along while the database updating thread 
is blocked waiting for the database to respond.




> To get updates based on clock instead of data...  Are there tools within
> perl for using clock/timer information?  Do I have to parse clock/timer
> info myself to make something happen every hour inside an existing loop?


You can just use the simple built-in time() function, which returns 
seconds since the epoch. If you want to update the database once per 
hour, you could do something like this:


   my $t = time;
   while (1) {
       do_database_update();
       $t += 3600;                # start of next update
       my $secs = $t - time;      # seconds left until then
       sleep $secs if $secs > 0;
   }

The tricky part is sharing data between the threads, which is extremely 
sucky in Perl's ithreads implementation, IMO. Read perlthrtut for an 
overview of the issues.
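
If sharing data between threads is a concern, the same clock test can sit inside the existing crunching loop instead. The following is only a minimal sketch under that assumption: make_ticker is a hypothetical helper, and the loop body and print are stand-ins for the real computation and the real status update.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that is true at most once
# per $interval seconds. Name and interface are illustrative only.
sub make_ticker {
    my ($interval) = @_;
    die "interval must be positive\n" unless $interval > 0;
    my $next = time + $interval;
    return sub {
        return 0 if time < $next;
        $next += $interval while $next <= time;   # skip any missed ticks
        return 1;
    };
}

my $due  = make_ticker(3600);     # once per hour
my $done = 0;
for my $item (1 .. 1000) {        # stand-in for the real data
    $done++;                      # stand-in for the number crunching
    print "processed $done items\n" if $due->();   # status update goes here
}
```

Calling time once per iteration is cheap compared with a remote database round trip, so the check itself should not slow the loop noticeably; the update still blocks the loop while it runs, but only once per interval.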






Re: processing at preset times

2005-11-08 Thread Jay Savage
On 11/8/05, Frank Bax [EMAIL PROTECTED] wrote:
> I have a script that takes a very long time to run - hours, sometimes even
> days (even on a P2-2.8GHz machine).  After loading some data from a database
> at the beginning (less than a second), the script does no i/o until results
> are output at the end of script.  I'd like to know how the script is
> progressing through its data, so I added some code to update the database
> at regular data intervals, but this has some problems:
> - database is remote, so script goes much slower with these status
>   updates.
> - updates are based on data instead of clock.
>
> I read something about threads in perl and was wondering if these status
> updates should be coded inside a thread so they have less impact on overall
> script performance.  The number crunching could still go on while database
> update happens in separate thread.
>
> To get updates based on clock instead of data...  Are there tools within
> perl for using clock/timer information?  Do I have to parse clock/timer
> info myself to make something happen every hour inside an existing loop?
>
> I realise that my subject line might suggest use of cron, but this is not
> workable unless there is some way for two scripts to communicate with each
> other.  If this could work, the processing script would probably still need
> a thread to do communication with timer script anyway.
>
> Frank

Frank,

See perldoc -f alarm and perlipc for details, but the normal idiom for
this sort of thing is to trap SIGALRM. A little pseudocode:

my $timeout = 3600; # 1 hour

while ( @your_data ) {
    eval {
        local $SIG{ALRM} = sub {
            # do something to save state
            die "alarm\n";    # NB: \n required
        };

        alarm $timeout;

        while ( @your_data ) {
            my $data = shift @your_data;
            # process your data, calling a subroutine
            # for the heavy lifting makes sense
        }
        alarm 0; # unset the alarm after you finish the last pass
    };

    if ( $@ && $@ eq "alarm\n" ) {
        # timed out
        your_log_sub();  # perform your logging here
        recover_state();
        # e.g. push interrupted procedure back onto stack
    } elsif ($@) {
        # do something about other errors
    }
}
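
A variation on the pseudocode above, offered as a sketch rather than a replacement: the SIGALRM handler only sets a flag, and the loop checks that flag between items, so no item is interrupted halfway and there is no state to recover. report_progress is a hypothetical routine, and the one-second interval and three-second run are assumptions chosen only to keep the demonstration short.

```perl
use strict;
use warnings;

my $interval   = 1;     # seconds between reports; 3600 for hourly
my $report_due = 0;     # set by the signal handler, cleared by the loop
my @reports;            # demo only: remembers what was reported

# Hypothetical reporting routine; a real one would update the database.
sub report_progress {
    my ($count) = @_;
    push @reports, $count;
    print "progress: $count items done\n";
}

$SIG{ALRM} = sub { $report_due = 1 };   # do as little as possible here
alarm $interval;

my $count = 0;
my $stop  = time + 3;             # demo only: crunch for about three seconds
while (time < $stop) {
    $count++;                     # stand-in for one unit of real work
    if ($report_due) {
        $report_due = 0;
        report_progress($count);
        alarm $interval;          # re-arm for the next report
    }
}
alarm 0;                          # cancel any pending alarm
```

The trade-off is that a report can be late by up to the time one item takes to process, which only matters if single items are very slow.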

As for how you do your logging, it's certainly simplest to just attach
to the database and log as you go along. It's hard to believe that the
time taken for a database connection every couple of hours would be
that inhibiting on a process that runs for days.

If it really is a big deal, though, your log subroutine can use fork()
to spawn a subprocess to log. See perldoc -f fork and perlipc for
details.
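
A minimal sketch of the fork() approach, under stated assumptions: slow_status_update and log_in_background are hypothetical names, and appending to a local file stands in for the slow remote database update. With a real database, the child would generally need to open its own connection rather than reuse a handle opened in the parent.

```perl
use strict;
use warnings;
use POSIX ();

# Demo only: a local file stands in for the remote database.
my $status_file = "status.$$.log";

# Hypothetical stand-in for the slow status update.
sub slow_status_update {
    my ($message) = @_;
    open my $fh, '>>', $status_file or die "Cannot open $status_file: $!";
    print {$fh} "$message\n";
    close $fh or die "Cannot close $status_file: $!";
}

# Fork a child to do the update so the parent returns at once.
sub log_in_background {
    my ($message) = @_;
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                 # child
        slow_status_update($message);
        POSIX::_exit(0);             # skip END blocks and destructors
    }
    return $pid;                     # parent carries on crunching
}

my $pid = log_in_background("50% done");
# ... the number crunching would continue here ...
waitpid $pid, 0;                     # reap the child so it is not left a zombie
```

In a long-running job the parent would not wait right away; it could reap finished children at the next report, or set $SIG{CHLD} = 'IGNORE' where the platform supports that.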

Threading probably isn't what you want here, since it will, among
other things, copy the entire data structure at the time the thread is
created. You could create a logging thread early in your program
before you load your data set, but in my mind at least that seems like
overkill for simple logging for a non-daemon process. perlthrtut is a
good place to start learning about threads.

HTH

-- jay
--
This email and attachment(s): [  ] blogable; [ x ] ask first; [  ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.dpguru.com  http://www.engatiki.org

values of β will give rise to dom!


Re: processing at preset times

2005-11-08 Thread Shawn Corey

Frank Bax wrote:
> I realise that my subject line might suggest use of cron, but this is
> not workable unless there is some way for two scripts to communicate
> with each other.  If this could work, the processing script would
> probably still need a thread to do communication with timer script anyway.


Actually, using cron may not be a bad idea. There is nothing like having 
a 100 hour job crash after 99 hours and 59 minutes. If you can, consider 
rewriting the program so it works in small chunks. Use cron to restart 
the program at regular intervals. You can get the status of the process 
by examining the state of the temporary files, using another cron job. 
Even if you have a power failure, the process will be restarted when the 
computer reboots. An added bonus is you don't have to log in in the wee 
hours of the morning just to make sure the program is still running.
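
The chunked approach could look roughly like the sketch below. Everything specific in it is an assumption for illustration: the state file name, the chunk size, the job size, and the summing loop that stands in for the real computation. Each run picks up where the previous one stopped, and the state file doubles as the progress report.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

my $state_file = 'crunch.state';      # hypothetical checkpoint file name
my $chunk_size = 100;                 # items to process per run (assumed)
my $total      = 250;                 # stand-in for the size of the real job

# Process one chunk, starting where the previous run stopped.
sub run_chunk {
    my $state = -e $state_file ? retrieve($state_file)
                               : { next => 0, sum => 0 };
    my $end = $state->{next} + $chunk_size;
    $end = $total if $end > $total;
    for my $i ($state->{next} .. $end - 1) {
        $state->{sum} += $i;          # stand-in for the real computation
    }
    $state->{next} = $end;

    # Write to a temporary file first so a crash cannot leave a
    # half-written checkpoint behind.
    store($state, "$state_file.tmp");
    rename("$state_file.tmp", $state_file)
        or die "Cannot rename checkpoint: $!";
    return $state;
}

my $state = run_chunk();
printf "%d of %d items done\n", $state->{next}, $total;
```

A cron entry would then run the script at whatever interval suits the chunk size; once next reaches the total, further runs do no work.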


--

Just my 0.0002 million dollars worth,
   --- Shawn

Probability is now one. Any problems that are left are your own.
   SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_





RE: processing at preset times

2005-11-08 Thread Ryan Frantz


> -----Original Message-----
> From: Shawn Corey [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, November 08, 2005 10:59 AM
> To: beginners@perl.org
> Subject: Re: processing at preset times
>
> Frank Bax wrote:
> > I realise that my subject line might suggest use of cron, but this is
> > not workable unless there is some way for two scripts to communicate
> > with each other.  If this could work, the processing script would
> > probably still need a thread to do communication with timer script
> > anyway.
>
> Actually, using cron may not be a bad idea. There is nothing like having
> a 100 hour job crash after 99 hours and 59 minutes. If you can, consider
> rewriting the program so it works in small chunks. Use cron to restart
> the program at regular intervals. You can get the status of the process

You may also want to consider init for process respawning.  You can also
set the runlevels at which the script will execute (in the event that you
need it to run at lower/higher runlevels across reboots).

> by examining the state of the temporary files, using another cron job.
> Even if you have a power failure, the process will be restarted when the
> computer reboots. An added bonus is you don't have to log in in the wee
> hours of the morning just to make sure the program is still running.
 




Re: processing at preset times

2005-11-08 Thread John W. Krahn
Frank Bax wrote:
> I have a script that takes a very long time to run - hours, sometimes even
> days (even on a P2-2.8GHz machine).  After loading some data from a
> database at the beginning (less than a second), the script does no i/o
> until results are output at the end of script.  I'd like to know how the
> script is progressing through its data, so I added some code to update
> the database at regular data intervals, but this has some problems:
> - database is remote, so script goes much slower with these status
>   updates.
> - updates are based on data instead of clock.
>
> I read something about threads in perl and was wondering if these status
> updates should be coded inside a thread so they have less impact on
> overall script performance.  The number crunching could still go on
> while database update happens in separate thread.
>
> To get updates based on clock instead of data...  Are there tools within
> perl for using clock/timer information?

perldoc Devel::DProf
perldoc Time::HiRes
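
As a rough illustration of what Time::HiRes offers for progress reporting, the sketch below uses its fractional-second time and sleep to measure elapsed time and make a naive estimate of the time remaining. The item count and the 0.05-second sleep are stand-ins for real work, and the estimate assumes every item takes about the same time.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # fractional-second versions of both

my $start = time;                 # seconds since the epoch, with fraction
my $items = 5;                    # stand-in for the size of the real job
for my $n (1 .. $items) {
    sleep 0.05;                   # stand-in for one unit of work
    my $so_far = time - $start;
    my $to_go  = $so_far / $n * ($items - $n);   # naive linear estimate
    printf "%d/%d done, %.2fs elapsed, about %.2fs to go\n",
        $n, $items, $so_far, $to_go;
}
my $elapsed = time - $start;
```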



John
-- 
use Perl;
program
fulfillment
