On 12 June 2015 at 22:47, David Anderson <[email protected]> wrote:

> Charlie found a bug that causes an inter-job pause on Mac/Linux.
> I still don't know what's the cause on Win,
> but I added another slot_debug message that will provide more info.
> Rom, can you make a private drop for Richard?
> Thanks -- David
>

Richard, do GPUGRID tasks specify two optional output files, COLVAR and
log.file? And are those and HILLS tagged with <copy_file/>?

David, take a look at the Procmon log Richard sent earlier. After the
GPUGRID app has exited, the client tries to open the COLVAR and log.file
files multiple times and fails. Then it renames the HILLS file to what
looks like a regular output file name and reads stderr.txt.

Once the client has done all that, it queries for the files in the slot
directory and deletes them.

That sequence of operations doesn't look like what the directory cleanup
code does. I think the pause happens earlier, in copy_output_files() ->
boinc_rename().

In the BOINC log snippet the sequence looks different, but that's because
the timestamps there can't be trusted. msg_printf() -> show_message()
stamps messages with gstate.now rather than the real current time, and
because the client has been sleeping, gstate.now is stale and the
timestamps are wrong.
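
To illustrate the timestamp point, here is a minimal sketch in Python (not
the actual client code). ClientState, poll(), run_demo() and the fake clock
are hypothetical names made up for the example; only the idea of stamping
messages with a cached "now" instead of a fresh clock reading comes from
the behaviour described above.

```python
import time


class ClientState:
    """Hypothetical stand-in for the client's global state (gstate)."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.now = clock()  # cached time, refreshed only in poll()

    def poll(self):
        # Assumed to run once per main-loop iteration.
        self.now = self.clock()


def show_message(state, text, log):
    # Stamp the message with the cached time, not a fresh clock reading.
    log.append((state.now, text))


def run_demo():
    fake_time = [1000.0]            # simulated wall clock, in seconds
    clock = lambda: fake_time[0]
    state = ClientState(clock)
    log = []

    show_message(state, "app exited", log)
    fake_time[0] += 10.0            # a blocking call stalls for 10 s
    show_message(state, "after slow call", log)
    real_after = clock()            # what the wall clock really says

    state.poll()                    # cache is refreshed only here
    show_message(state, "next poll", log)
    return log, real_after
```

In this sketch the second message carries the same timestamp as the first
even though ten simulated seconds have passed, so the log alone can't show
where the pause happened.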

-Juha
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
