See the ACTIVE_TASK::handle_premature_exit() function in app_control.cpp for
the actual implementation. The limit there is 100 rather than the 10 you
propose, so in theory it takes about 53 minutes (100 * ~32 seconds) to reach
it. I have seen a few Task detail pages showing the "too many exit(0)s" error
message.
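
To make that estimate easy to redo for other limits, here is a minimal sketch
of the arithmetic. The function and constant names are my own for illustration
and are not taken from app_control.cpp; the ~32 seconds per restart cycle is
only the rough figure quoted above.

```cpp
// Hypothetical names for illustration; the real limit lives in
// ACTIVE_TASK::handle_premature_exit() in app_control.cpp.
constexpr int kCurrentLimit = 100;        // limit mentioned in this message
constexpr int kProposedLimit = 10;        // limit proposed in the quoted message
constexpr double kSecondsPerCycle = 32.0; // rough cost of one exit/restart cycle

// Minutes that pass before `max_exits` premature exits have accumulated,
// assuming every cycle costs about `seconds_per_cycle` seconds.
constexpr double minutes_until_limit(int max_exits, double seconds_per_cycle) {
    return max_exits * seconds_per_cycle / 60.0;
}
```

With these assumptions, minutes_until_limit(kCurrentLimit, kSecondsPerCycle)
comes to about 53.3 minutes, and the proposed limit of 10 to about 5.3 minutes.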

If that limit were reduced, I think the core client should check whether it
has actually sent a heartbeat to the application recently before counting an
exit against the task. It might also make sense to give the temporary exit
capability a larger count, or maybe even to exempt it.
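
A rough sketch of how those ideas could fit together with the "same checkpoint
N times in a row" rule from the quoted message follows. Everything here
(RestartTracker, ExitKind, the checkpoint id, the heartbeat window) is
hypothetical and does not exist in the BOINC client; it only illustrates the
policy, not how the client actually tracks exits.

```cpp
#include <cstdint>

// Hypothetical illustration only: none of these names exist in the BOINC client.
enum class ExitKind { Premature, TemporaryExit };

class RestartTracker {
public:
    RestartTracker(int max_restarts, double heartbeat_window_secs)
        : max_restarts_(max_restarts),
          heartbeat_window_secs_(heartbeat_window_secs) {}

    // Record one exit of the task. Returns true if the task should be
    // declared dead under this sketch's policy.
    bool on_exit(ExitKind kind, std::uint64_t checkpoint_id,
                 double secs_since_last_heartbeat) {
        // Temporary exits are exempted entirely in this sketch.
        if (kind == ExitKind::TemporaryExit) return false;

        // If the client has not sent a heartbeat recently, the exit may be
        // the client's fault, so it is not counted against the task.
        if (secs_since_last_heartbeat > heartbeat_window_secs_) return false;

        // Progress to a different checkpoint restarts the count.
        if (!have_checkpoint_ || checkpoint_id != last_checkpoint_) {
            have_checkpoint_ = true;
            last_checkpoint_ = checkpoint_id;
            count_ = 0;
        }

        ++count_;
        return count_ >= max_restarts_;
    }

private:
    int max_restarts_;
    double heartbeat_window_secs_;
    int count_ = 0;
    std::uint64_t last_checkpoint_ = 0;
    bool have_checkpoint_ = false;
};
```

For example, RestartTracker tracker(10, 30.0) would report the task as dead on
the tenth counted exit from the same checkpoint, while any exit from a new
checkpoint starts the count over at one.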
-- 
                                                        Joe



On Fri, 05 Aug 2011 08:33:27 -0400, <[email protected]> wrote:

> I had a fix for #3 once upon a time.  If a task resets to the very same
> checkpoint ten times in a row, it should be declared dead.  Note that if
> it moves on to the next checkpoint, the count should be restarted.
>
> Not having this check is a way to waste days of CPU time.
>
> jm7
>
>
> From:    robert miles <robertmiles@bellsouth.net>
> Sent by: <boinc_dev-bounce[email protected]u>
> Date:    08/04/2011 07:57 PM
> To:      <[email protected]>
> Subject: Re: [boinc_dev] 0% progress for Pentium(R) Dual-Core CPU and
>          Windows 7 combination
>
>
>
>
> I've seen a few fairly similar problems you and the users may want to
> check for:
>
> 1.  The workunit crashes in such a way that it stops using any CPU
> time at all, and does NOT tell the BOINC client this has happened.
> It may or may not tell Windows something has gone wrong.  This
> often freezes the heartbeat tests so it may even go well past the time
> when it should have halted due to running too long.
>
> 2.  A few programs give different results depending on whether
> they run on Intel CPUs or AMD CPUs.
>
> 3.  Some workunits repeatedly end without an error code, but
> also without producing the output file to tell the BOINC client
> that it finished properly.  Should be visible in at least some of
> the log files.
>
> 4.  Are the checkpoints closely spaced enough that those users
> should be able to reach them even with the following combination
> of settings?
>
>     a.  The usual 60 minutes before possibly going on to some
>     other BOINC workunit from a different project.
>
>     b.  Only 60% of the CPU time available due to running on
>     a laptop.
>
>     c. Enough other BOINC projects enabled with enough
>     resource share that if the 60 minutes is reached, the
>     workunit will usually be suspended.
>
>     d. Either the option to keep suspended workunits in
>     memory is not enabled, or BOINC is not allowed to use
>     enough memory to do it.
>
> 5. A few workunits do not have the progress reporting
> working properly, but go on to produce correct results
> otherwise.
>
> 6. Is there any difference between running with graphics
> enabled and running without graphics enabled (if you even
> have a screensaver section yet)?
>
> 7. Does recompiling the application specifically for that
> CPU and OS combination offer any improvement on the
> machines with the problem?
>
> Robert Miles
>
>
>
>> Date: Thu, 4 Aug 2011 10:07:47 -0400
>> From: Boyu Zhang <[email protected]>
>> Subject: [boinc_dev] 0% progress for Pentium(R) Dual-Core CPU and
>> Windows 7 combination
>
>> Hi All,
>
>> We have a BOINC project distributed from our lab, and I observed from the
>> database that a high percentage of hosts with a Pentium(R) Dual-Core CPU
>> and the Windows 7 operating system have the same problem: the BOINC
>> application running on those hosts constantly shows 0% progress no matter
>> how long it runs. On Linux systems we don't see the same problem.
>
>> Moreover, among hosts with exactly the same CPU and OS version, some are
>> successful while others have the problem. We tried to reproduce the
>> situation on the machines in the lab, but with no luck so far. Has anyone
>> had the same problem? Any help and suggestions are appreciated,
>> thank you!
>
>> Boyu
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
