Re: [boinc_dev] CPU throttling and GPU apps

Richard Haselgrove Mon, 08 Jul 2013 10:41:32 -0700

What do we mean by 'use' a certain fraction of a CPU, anyway?

AFAIK, projects have a rather crude tool by which they declare what proportion 
of an application's fpops are performed on the CPU - <cpu_frac>x</cpu_frac> in 
the xml format of the plan class definition - and nothing else. The *scheduled* 
CPU usage (and I assume this is what David is referring to) is calculated by 
the server from this frac and the relative speeds of the host's CPU and GPU.
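
As a rough illustration of how a scheduled figure could fall out of those two inputs, here is a minimal sketch. It is an assumed model, not a copy of the server code: it supposes the CPU and GPU portions of the work run one after the other, and the function name and sample FLOPS figures are made up for the example.

```python
def scheduled_cpu_usage(cpu_frac, cpu_flops, gpu_flops):
    """Estimate the fraction of one CPU a GPU task keeps busy.

    Assumed model (hypothetical, for illustration only):
    - cpu_frac of the task's fpops run on the CPU, the rest on the GPU
    - the two portions do not overlap in time
    """
    if not 0.0 <= cpu_frac <= 1.0:
        raise ValueError("cpu_frac must be between 0 and 1")
    if cpu_flops <= 0 or gpu_flops <= 0:
        raise ValueError("speeds must be positive")

    # Seconds spent per unit of work on each device.
    cpu_seconds = cpu_frac / cpu_flops
    gpu_seconds = (1.0 - cpu_frac) / gpu_flops

    # Share of the total run time during which the CPU is the busy device.
    return cpu_seconds / (cpu_seconds + gpu_seconds)


if __name__ == "__main__":
    # Same declared cpu_frac, same CPU, two GPUs of different speed
    # (all three speeds are invented sample values).
    for gpu_flops in (300e9, 600e9):
        usage = scheduled_cpu_usage(0.01, 3e9, gpu_flops)
        print(f"GPU at {gpu_flops / 1e9:.0f} GFLOPS: schedule {usage:.3f} CPUs")
```

Under this model a faster GPU raises the scheduled CPU figure even though the application and its declared cpu_frac are unchanged, which shows how little the number says about what the app really does on a given host.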


These limited tools can lead to assumptions which are widely divergent from 
reality, in either direction.

A concrete example: I run two hosts on GPUGrid (who should know a thing or two 
about GPU programming).

Host 43404 is told to schedule 0.55 CPUs for each GPU task; host 132158 is told 
to schedule 0.667 CPUs. Not a great deal of difference.

But compare the reality.
http://www.gpugrid.net/results.php?hostid=43404&state=3 typically uses ~900 CPU 
seconds per task, just 0.06 CPUs.
http://www.gpugrid.net/results.php?hostid=132158&state=3 uses ~30,000 CPU 
seconds per task, or over 0.99 CPUs.
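
The arithmetic behind those figures can be sketched as follows. The actual CPU fraction is CPU time divided by elapsed time; since only the CPU seconds and the resulting fractions are quoted here, the sketch works backwards to the run time they imply and compares scheduled with actual usage. The function names are illustrative, and the inputs are the rounded values quoted above, so the results are approximate.

```python
def implied_elapsed_seconds(cpu_seconds, cpu_fraction):
    """Run time implied by a CPU time and the fraction of a CPU it represents."""
    if cpu_fraction <= 0:
        raise ValueError("cpu_fraction must be positive")
    return cpu_seconds / cpu_fraction


def scheduled_to_actual_ratio(scheduled_cpus, actual_cpus):
    """How many times larger the scheduled CPU usage is than the actual usage."""
    if actual_cpus <= 0:
        raise ValueError("actual_cpus must be positive")
    return scheduled_cpus / actual_cpus


if __name__ == "__main__":
    # (host id, scheduled CPUs, approx. CPU seconds per task, approx. actual CPUs)
    hosts = [
        (43404, 0.55, 900, 0.06),
        (132158, 0.667, 30_000, 0.99),
    ]
    for host_id, scheduled, cpu_seconds, actual in hosts:
        hours = implied_elapsed_seconds(cpu_seconds, actual) / 3600
        ratio = scheduled_to_actual_ratio(scheduled, actual)
        print(f"host {host_id}: implied run time {hours:.1f} h, "
              f"scheduled/actual = {ratio:.2f}")
```

A ratio well above 1 means the scheduler reserves far more CPU than the task uses; a ratio below 1 means it reserves too little. The two hosts land on opposite sides of 1.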


Both hosts are running CUDA 4.2 applications/runtime, but there are many 
differences between them (listed as host 43404 / host 132158):

GPU type: Fermi / Kepler
GPU driver: 310.70 / 314.22
Operating system: WinXP-32 / Win7-64
BOINC version: v6.12.34 / v7.0.64
BOINC mode: Service / User
Application: Short (v6.52) / Long (v6.18)

We could - conceivably - program the BOINC platform to take account of all 
these variables, but I suspect that would be absurdly complicated, and quite 
probably fruitless. From reading the project message boards, I suspect the main 
cause of the difference between 0.06 CPU and 0.99 CPU usage is the current 
handling of the newer Kepler architecture - so an application-level issue 
rather than a BOINC one.

All in all, I think it would be better to go back to using the CPU 
throttle/thermal control feature for its original purpose - controlling the 
thermal output of pure-CPU apps (defined as apps which have *no* co-processor 
specified). If we need a thermal control mechanism for GPUs (which was the 
original request by Admin Team "St.Petersburg"), add one properly tailored to 
the hardware characteristics of co-processors.
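
To make the alternatives discussed in this thread concrete, here is a minimal sketch of the three throttling rules side by side: the fixed 0.5-CPU cut-off, a cut-off tied to the current throttle setting, and throttling pure-CPU apps only. The Task class, the function names, and the assumption that each rule looks at the *scheduled* CPU usage are hypothetical; none of this is BOINC client code.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    scheduled_cpus: float  # CPU usage the server told the client to schedule
    uses_coproc: bool      # True if the app has a co-processor (GPU) specified


def throttle_fixed_cutoff(task, cutoff=0.5):
    """Throttle unless the task is scheduled for less than a fixed CPU share."""
    return task.scheduled_cpus >= cutoff


def throttle_relative_to_setting(task, throttle_setting):
    """Throttle unless the task is scheduled for less than the throttle value."""
    return task.scheduled_cpus >= throttle_setting


def throttle_pure_cpu_only(task):
    """Throttle only apps that have no co-processor specified."""
    return not task.uses_coproc


if __name__ == "__main__":
    tasks = [
        Task("GPU task, host 43404", 0.55, True),
        Task("GPU task, host 132158", 0.667, True),
        Task("pure-CPU task", 1.0, False),  # made-up comparison case
    ]
    for t in tasks:
        print(t.name,
              throttle_fixed_cutoff(t),
              throttle_relative_to_setting(t, throttle_setting=0.9),
              throttle_pure_cpu_only(t))
```

With the scheduled values quoted earlier, the fixed cut-off throttles both GPU tasks although one of them barely touches the CPU, while the other two rules leave both alone. Only the third rule gives an answer that does not depend on how accurate the scheduled figure is.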



>________________________________
> From: "McLeod, John" <john.mcl...@sap.com>
>To: David Anderson <da...@ssl.berkeley.edu>; "boinc_dev@ssl.berkeley.edu" 
><boinc_dev@ssl.berkeley.edu> 
>Sent: Monday, 8 July 2013, 16:21
>Subject: Re: [boinc_dev] CPU throttling and GPU apps
> 
>
>How about changing it to not throttle apps that use less than the current 
>throttling value?  E.g. if throttling is set at .9, don't throttle a task that 
>uses .8.
>
>-----Original Message-----
>From: boinc_dev [mailto:boinc_dev-boun...@ssl.berkeley.edu] On Behalf Of David 
>Anderson
>Sent: Friday, July 05, 2013 11:50 PM
>To: boinc_dev@ssl.berkeley.edu
>Subject: Re: [boinc_dev] CPU throttling and GPU apps
>
>I changed it to not throttle apps that use < .5 CPUs
>-- David
>
>On 04-Jul-2013 2:38 PM, Eric J Korpela wrote:
>> The only pro I can think of would be to reduce GPU use to keep
>> temperature or power use down, but that would be better implemented as
>> GPU throttling.
>>
>> On Thu, Jul 4, 2013 at 5:52 AM, Bernd Machenschalk
>> <bernd.machensch...@aei.mpg.de> wrote:
>>> On 04.07.13 13:15, Heinz-Bernd Eggenstein wrote:
>>>
>>>> I guess there are several pros and cons, e.g.:
>>>>
>>>> cons:
>>>>     - on one hand, GPU apps (depending on the CPU share?) get a higher OS
>>>> prio (in terms of "niceness") to prevent the GPU being starved. Throttling
>>>> the CPU might very well cause this starvation
>>>>     - if a GPU app has a rather low CPU runtime share in the first place,
>>>> further CPU throttling does not seem too useful.
>>>>     - in order to avoid GPU load to interfere with the user doing non-BOINC
>>>> related stuff, there is already the setting "Suspend GPU work while
>>>> computer is in use".
>>>
>>>
>>> Here's one more:
>>>
>>> When not synchronized with GPU-CPU communication (kernel launches, data
>>> transfer), throttling an app can break any running GPU task. I'm not sure
>>> whether the throttling implementations of all BOINC clients that are being
>>> used properly honor critical sections, nor am I sure that all GPU apps of
>>> all projects make proper use of these.
>>>
>>>
>>>> pros:
>>>>     I can't think about many
>>>
>>>
>>> Actually I can't think about any.
>>>
>>> Best,
>>> Bernd
>>>
>>>
>>> _______________________________________________
>>> boinc_dev mailing list
>>> boinc_dev@ssl.berkeley.edu
>>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>>> To unsubscribe, visit the above URL and
>>> (near bottom of page) enter your email address.
