> I'll make sure that in the new design:
>
> 1) error rate and max jobs/day are maintained for each
> (host, app version), rather than for the host as a whole

This will not help with app_info, where the app version can be set
arbitrarily. Again, look at my scenario. There are two devices in the host:
same host, same app, same CUDA version, same driver. One works OK
constantly, the other does not (but it's not simply broken - that's the
difficulty of the situation). You will end up maintaining a separate queue
for each device on each host, and with completely overloaded project
servers handling tons of information that isn't actually needed.
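To make the granularity problem concrete, here is a minimal C++ sketch of
the bookkeeping David proposes (all names are hypothetical; this is not
actual BOINC scheduler code). Both of my GPUs map to the same record, so
the server cannot tell them apart without going per-device:

    #include <map>
    #include <utility>

    struct ReliabilityRecord {
        double error_rate;      // recent fraction of invalid results
        int max_jobs_per_day;   // current daily quota
    };

    // one record per (host ID, app version ID); with an anonymous-platform
    // app_info.xml the version is whatever the client chooses to report
    std::map<std::pair<int, int>, ReliabilityRecord> reliability;

    ReliabilityRecord& lookup(int host_id, int app_version_id) {
        // results from the healthy 9600GSO and the flaky 9400GT arrive
        // with identical keys, so both devices share this one record
        return reliability[{host_id, app_version_id}];
    }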
> 2) max jobs/day is maintained more conservatively, so that
> if 1 out of N GPUs is returning bad results,
> the host will get few GPU jobs (e.g. 1 per day)

Then you have a good chance of decreasing overall performance under the new
scenario (even going to adaptive replication) instead of increasing it, and
it will not solve the fundamental problem of adaptive replication: each
individual returned result will have a MUCH, really MUCH lower confidence
level compared with redundancy of 2. If a project can rely only on the set
of results as a whole, maybe that's not so bad. But if each separate result
is needed (and I think for SETI that's the case, because even a persistence
check that uses the set of results will probably fail on a single incorrect
result until we have each point of sky observed truly _many_ times - tens
of times, maybe), such a decrease in confidence level will ruin trust in
the project's results. IMO we don't need many but untrusted results; they
will have no value at all...

> -- David
>
> Raistmer wrote:
>>> Consider a possible scenario where the children are allowed to use the
>>> host for gaming when they have finished their homework, and the games
>>> leave the GPU in a bad state. Such a host could transition from
>>> reliable to unreliable daily, and hundreds of corrupted results could
>>> be assimilated each time. If the host were turned off at bedtime, it
>>> would be in reliable condition when turned on the next day.
>>>
>>> The daily quota is no protection for scenarios like that if the host is
>>> also doing CPU work for the same project. All it takes is one good CPU
>>> result for each 49 bad GPU results to keep a daily quota of 100 at max.
>>> --
>>> Joe
>>
>> I've seen just the same on one of my PCs with dual GPUs.
>> The 9400GT goes mad from time to time (maybe system overheating, maybe
>> some other reason) and starts to produce legal but bad results.
>> And this case is even worse (with regard to quota) than the one
>> described above, because the same host has another, relatively fast GPU
>> - a 9600GSO - that continues to produce correct results.
>> It plus the CPU can surely keep the quota far from zero... and even a
>> possible separation of CPU and GPU quotas would not help in this case...
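To put rough numbers on the confidence argument above (my simplification:
each host returns a legal-but-bad result with probability p, hosts are
independent, and the figures are illustrative only):

    #include <cstdio>

    int main() {
        double p = 0.05;            // chance of a legal-but-bad result
        // redundancy of 2: a bad result is accepted only if a second,
        // independently computed bad result also validates against it,
        // so p*p is an upper bound
        double p_redundant = p * p; // ~0.0025
        // adaptive replication: a single result from a "reliable" host
        // is accepted as-is
        double p_adaptive = p;      // 0.05 - twenty times worse here
        printf("redundant: %g, adaptive: %g\n", p_redundant, p_adaptive);
        return 0;
    }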
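Joe's quota arithmetic can also be checked directly. This simulation
assumes my understanding of the current quota rule (an invalid result
decrements the host's daily quota by one, a valid result doubles it,
capped at the project limit):

    #include <algorithm>
    #include <cstdio>

    int main() {
        const int cap = 100;   // project's daily result quota
        int quota = cap;
        for (int cycle = 1; cycle <= 5; cycle++) {
            for (int i = 0; i < 49; i++)      // 49 bad GPU results
                quota = std::max(1, quota - 1);
            quota = std::min(cap, quota * 2); // 1 good CPU result
            printf("after cycle %d: quota = %d\n", cycle, quota);
        }
        // prints 100 every cycle: 98% bad results never dent the quota
        return 0;
    }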