On Mon, Jul 02, 2007 at 01:55:08PM +0900, Heiga ZEN (Byung Ha CHUN) alleged:
> Hi all,
> 
> I'm using Maui with Torque.
> I checked my Maui log file and found some strange(?) entries:
> 
> 
> 07/02 12:54:26 INFO:     checkpointing node 'p4-6'
> 07/02 12:54:26 INFO:     checkpointing node 'p4-7'
> ...
> 07/02 12:54:26 INFO:     checkpointing node 'pd4-13'
> 07/02 12:54:26 INFO:     checkpointing node '5958.jasmine'
> 07/02 12:54:26 INFO:     checkpointing node '5959.jasmine'
> ...
> 07/02 12:54:26 INFO:     checkpointing node '6044.jasmine'

This looks like an old bug in the pbs client libraries that was fixed
years ago.  Maui would issue a pbs_statnode() call, the read of the reply
would hit its timeout, and the unread data would still be on the wire
when the next call, pbs_statjob(), read from the same connection.
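
A toy sketch of that failure mode, in Python. This is not the actual
TORQUE client code: the Connection class and the stat() helper are
hypothetical stand-ins, and the only thing it is meant to show is how a
reply that arrives after a read timeout gets handed to the next request
on the same connection.

```python
from collections import deque


class Connection:
    """Hypothetical stand-in for one client/server socket."""

    def __init__(self):
        self.wire = deque()     # replies that have arrived but are unread
        self.delayed = deque()  # replies the server has not delivered yet

    def send(self, request, slow=False):
        # The server answers every request; a slow answer is held back.
        reply = f"reply-to-{request}"
        (self.delayed if slow else self.wire).append(reply)

    def deliver_delayed(self):
        # Late replies finally show up on the wire.
        self.wire.extend(self.delayed)
        self.delayed.clear()

    def read(self):
        # Oldest unread reply, or None to model a read timeout.
        return self.wire.popleft() if self.wire else None


def stat(conn, request, slow=False):
    conn.send(request, slow=slow)
    reply = conn.read()
    conn.deliver_delayed()  # late data lands after the caller gave up
    return reply


conn = Connection()
nodes = stat(conn, "statnode", slow=True)  # read times out
jobs = stat(conn, "statjob")               # picks up the stale node reply
print(nodes)  # None
print(jobs)   # reply-to-statnode
```

The second call gets an answer that belongs to the first one, which is
how records of one kind can end up being parsed as the other.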

You didn't say which version you are running, but I assume it is an old
version of TORQUE.  Update TORQUE, then rebuild Maui against the updated
TORQUE (updating Maui itself is not required for this particular bug).


-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers
