[darktable-dev] OpenCL issues

2018-10-29 Thread Heiko Bauke

Hi,

I own a laptop with a low-end OpenCL capable graphics card.  Usually I 
explicitly turn OpenCL support off for darktable.


Today I enabled OpenCL support in darktable for some testing purposes. 
Starting darktable (current git master with some extensions not related 
to OpenCL) with the options '-d opencl -d perf' yields



0.062332 [opencl_init] opencl related configuration options:
0.062345 [opencl_init] 
0.062359 [opencl_init] opencl: 1
0.062361 [opencl_init] opencl_library: ''
0.062364 [opencl_init] opencl_memory_requirement: 200
0.062367 [opencl_init] opencl_memory_headroom: 0
0.062372 [opencl_init] opencl_device_priority: ''
0.062377 [opencl_init] opencl_mandatory_timeout: 0
0.062382 [opencl_init] opencl_size_roundup: 16
0.062386 [opencl_init] opencl_async_pixelpipe: 0
0.062389 [opencl_init] opencl_synch_cache: 0
0.062392 [opencl_init] opencl_number_event_handles: 0
0.062396 [opencl_init] opencl_micro_nap: 0
0.062399 [opencl_init] opencl_use_pinned_memory: 0
0.062402 [opencl_init] opencl_use_cpu_devices: 0
0.062404 [opencl_init] opencl_avoid_atomics: 0
0.062407 [opencl_init] 
0.062615 [opencl_init] found opencl runtime library 'libOpenCL'
0.062652 [opencl_init] opencl library 'libOpenCL' found on your system and loaded
0.078292 [opencl_init] found 1 platform
0.078310 [opencl_init] found 1 device
0.078526 [opencl_init] device 0 `GeForce GT 730M' has sm_20 support.
0.078613 [opencl_init] device 0 `GeForce GT 730M' supports image sizes of 16384 x 16384
0.078619 [opencl_init] device 0 `GeForce GT 730M' allows GPU memory allocations of up to 501MB
[opencl_init] device 0: GeForce GT 730M
     GLOBAL_MEM_SIZE:          2004MB
     MAX_WORK_GROUP_SIZE:      1024
     MAX_WORK_ITEM_DIMENSIONS: 3
     MAX_WORK_ITEM_SIZES:      [ 1024 1024 64 ]
     DRIVER_VERSION:           390.77
     DEVICE_VERSION:           OpenCL 1.2 CUDA
0.157247 [opencl_init] options for OpenCL compiler: -cl-fast-relaxed-math -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"/usr/local/darktable_guided/share/darktable/kernels"


[...]


0.177151 [opencl_init] compiling program `heal.cl' ..
0.177158 [opencl_fopen_stat] could not open file `/usr/local/darktable_guided/share/darktable/kernels/heal.cl'!
0.177163 [opencl_init] kernel loading time: 0.0198
0.177170 [opencl_init] OpenCL successfully initialized.
0.177173 [opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
0.177176 [opencl_init]  0   'GeForce GT 730M'
0.177180 [opencl_init] FINALLY: opencl is AVAILABLE on this system.
0.177183 [opencl_init] initial status of opencl enabled flag is ON.
0.177205 [opencl_create_kernel] successfully loaded kernel `blendop_mask_Lab' (0) for device 0
0.177213 [opencl_create_kernel] successfully loaded kernel `blendop_mask_RAW' (1) for device 0
0.177222 [opencl_create_kernel] successfully loaded kernel `blendop_mask_rgb' (2) for device 0


[...]

Apart from the fact that the kernel heal.cl cannot be loaded, everything 
looks fine to me.  Nevertheless, all modules run on the CPU only and never 
on my GPU, including 'denoise (profiled)'.



38.392474 [dev_pixelpipe] took 0.405 secs (1.365 CPU) processed `denoise (profiled)' on CPU, blended on CPU [full]
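
To check a whole '-d perf' log for modules that still run on the CPU, a small script can help. The following is only a sketch: it assumes each log entry sits on a single line and has exactly the shape of the line quoted above, and the name parse_perf_line is made up for this example. Other variants of the message that darktable may print are not covered.

```python
import re

# Pattern modelled on the single "-d perf" line quoted in this thread;
# other message variants are not handled by this sketch.
PERF_LINE = re.compile(
    r"\[dev_pixelpipe\] took (?P<secs>[\d.]+) secs .*?"
    r"processed `(?P<module>[^']+)' on (?P<device>\w+)"
    r"(?:, blended on (?P<blend>\w+))?"
    r" \[(?P<pipe>\w+)\]"
)


def parse_perf_line(line):
    """Return module name, time and devices for one perf line, else None."""
    m = PERF_LINE.search(line)
    if m is None:
        return None
    return {
        "module": m.group("module"),
        "seconds": float(m.group("secs")),
        "device": m.group("device"),   # where the module itself ran
        "blend": m.group("blend"),     # where blending ran (may be None)
        "pipe": m.group("pipe"),       # e.g. "full"
    }
```

Filtering the parsed entries for device == "CPU" or blend == "CPU" then lists every module that did not reach the GPU.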


How can I enable GPU processing?

In particular, I need to enable blending on the GPU.  Currently I am working 
on automatic mask refinement based on a guided filter (see 
http://kaiminghe.com/publications/eccv10guidedfilter.pdf).  For this 
purpose I have extended the function dt_develop_blend_process, and now I 
also have to adjust dt_develop_blend_process_cl.  But currently the 
latter function is never called.  Any hints?



Heiko


--
-- Number Crunch Blog @ https://www.numbercrunch.de
--  Cluster Computing @ https://www.clustercomputing.de
--  Social Networking @ https://www.researchgate.net/profile/Heiko_Bauke
___
darktable developer mailing list
to unsubscribe send a mail to darktable-dev+unsubscr...@lists.darktable.org



Re: [darktable-dev] OpenCL issues

2018-10-29 Thread Aurélien Pierre
Hello Heiko!

Very pleased to hear that (not the bug part, though).

Did you try with smaller pictures? Usually, darktable falls back to the CPU
when there are not enough resources available on the GPU.

You can try:

 1. the command *nvidia-smi* to see how the GPU RAM is used (if there is
not enough vRAM available, you will see an OpenCL error code -4).
 2. setting opencl_async_pixelpipe=true in darktablerc
 3. setting opencl_mandatory_timeout > 200 in darktablerc
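
For the first item, the memory figures can also be read by a script. This is a rough sketch that assumes nvidia-smi is on the PATH; it uses the --query-gpu and --format options of nvidia-smi, while the function names and the dictionary keys are invented for this example.

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' pairs in MiB, one GPU per line."""
    gpus = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        gpus.append({"used_mib": used, "total_mib": total,
                     "free_mib": total - used})
    return gpus


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory of every GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory_csv(out)
```

Calling query_gpu_memory() before and after starting darktable shows how much video memory other processes already occupy.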

Also, I have discovered this week that Gnome 3.28.2 with Xorg has
serious memory leak issues that can affect OpenCL performance. After
several hours of uptime, Xorg consumes up to 1GB RAM/vRAM on Ubuntu
18.04, so OpenCL does not have enough memory left.

I hope this helps,

Aurélien.


Re: [darktable-dev] OpenCL issues

2018-10-29 Thread Heiko Bauke

On 29.10.18 at 22:55, Aurélien Pierre wrote:

> You can try:
>
>  1. the command *nvidia-smi* to see how the GPU RAM is used (if there is
> not enough vRAM available, you will see an OpenCL error code -4).
>  2. setting opencl_async_pixelpipe=true in darktablerc
>  3. setting opencl_mandatory_timeout > 200 in darktablerc


This did not help.  Finally, I explicitly set

opencl_device_priority=*/!0,*/*/*

which, according to the documentation, is the default.  Now the GPU is 
enabled except for the preview pixelpipe, as also indicated by the log:



0.208921 [opencl_priorities] these are your device priorities:
0.208925 [opencl_priorities]     image   preview   export   thumbnail
0.208934 [opencl_priorities]     0       -1        0        0
0.208941 [opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
0.208945 [opencl_priorities]     image   preview   export   thumbnail
0.208953 [opencl_priorities]     0       0         0        0


The default

opencl_device_priority=

however, yields the following on my laptop:


0.209700 [opencl_priorities] these are your device priorities:
0.209703 [opencl_priorities]     image   preview   export   thumbnail
0.209711 [opencl_priorities]     -1      -1        -1       -1
0.209716 [opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
0.209719 [opencl_priorities]     image   preview   export   thumbnail
0.209724 [opencl_priorities]     0       0         0        0


I.e., no pixelpipe is processed on the GPU.
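
The same check can be scripted. The sketch below assumes the [opencl_priorities] output looks like the excerpts above, with the device priorities as the first numeric row and -1 meaning that the pixelpipe gets no OpenCL device; the function names are made up for this example.

```python
import re

# Column order as printed in the [opencl_priorities] header line.
PIPES = ("image", "preview", "export", "thumbnail")

# One row of four integers following the [opencl_priorities] tag.
ROW = re.compile(
    r"\[opencl_priorities\]\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s*$"
)


def device_priorities(log_text):
    """Return the first numeric priorities row as {pixelpipe: device}."""
    for line in log_text.splitlines():
        m = ROW.search(line)
        if m:
            return dict(zip(PIPES, map(int, m.groups())))
    return None


def cpu_only_pipes(priorities):
    """List the pixelpipes that have no OpenCL device assigned (-1)."""
    return [pipe for pipe, device in priorities.items() if device < 0]
```

The second numeric row (the "mandatory" flags) is ignored on purpose, since only the first row says which device each pixelpipe uses.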


Heiko






Re: [darktable-dev] OpenCL issues

2018-10-29 Thread Ulrich Pegelow

On 29.10.18 at 23:35, Heiko Bauke wrote:

> This did not help.  Finally, I explicitly set
>
> opencl_device_priority=*/!0,*/*/*
>
> which, according to the documentation, is the default.  Now the GPU is
> enabled except for the preview pixelpipe, as also indicated by the log:
>
> 0.208921 [opencl_priorities] these are your device priorities:
> 0.208925 [opencl_priorities]     image   preview   export   thumbnail
> 0.208934 [opencl_priorities]     0       -1        0        0
> 0.208941 [opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
> 0.208945 [opencl_priorities]     image   preview   export   thumbnail
> 0.208953 [opencl_priorities]     0       0         0        0
>
> The default
>
> opencl_device_priority=
>
> [...]



The empty string is not the default for opencl_device_priority; the 
default is "*/!0,*/*/*". Please note that manual device selection by this 
parameter is only effective if opencl_scheduling_profile=default.
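
To illustrate why the default string produces the priorities shown in the log, here is a simplified model of how such a string could be read. It is a sketch, not darktable's actual parser: it assumes that the four '/'-separated fields stand for the image, preview, export and thumbnail pixelpipes in that order, that '*' means any device, that '!n' excludes device n, and that only device numbers (no device names) are used. It also returns just the first usable device per pixelpipe. The function name resolve_priority is invented for this example.

```python
PIPES = ("image", "preview", "export", "thumbnail")


def resolve_priority(spec, num_devices):
    """Pick the first usable device per pixelpipe, or -1 if there is none."""
    fields = spec.split("/") if spec else []
    result = {}
    for index, pipe in enumerate(PIPES):
        field = fields[index] if index < len(fields) else ""
        excluded = set()
        chosen = -1  # -1 stands for "no OpenCL device", as in the log
        for token in (t.strip() for t in field.split(",")):
            if not token:
                continue
            if token.startswith("!"):
                # "!n" removes device n from later wildcard matches
                excluded.add(int(token[1:]))
                continue
            if token == "*":
                candidates = [d for d in range(num_devices)
                              if d not in excluded]
            else:
                device = int(token)
                usable = 0 <= device < num_devices and device not in excluded
                candidates = [device] if usable else []
            if candidates:
                chosen = candidates[0]
                break
        result[pipe] = chosen
    return result
```

Under these assumptions, a single-GPU system with "*/!0,*/*/*" gets device 0 for image, export and thumbnail and no device for preview, while an empty string leaves every pixelpipe without a device. Both match the two log excerpts earlier in the thread.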


Ulrich





Re: [darktable-dev] OpenCL issues

2018-10-30 Thread Heiko Bauke

Dear Ulrich,

On 30.10.18 at 07:07, Ulrich Pegelow wrote:

> The empty string is not the default for opencl_device_priority, default
> is "*/!0,*/*/*". Please note that manual device selection by this
> parameter is only effective if opencl_scheduling_profile=default.


I checked what is written to darktablerc on a fresh installation.  It is indeed

opencl_device_priority=*/!0,*/*/*

My darktablerc was quite old.  Possibly darktable behaved 
differently in the past, or I messed up the configuration file so 
that the value of opencl_device_priority ended up empty.



Heiko





Re: [darktable-dev] OpenCL issues

2018-10-30 Thread Andreas Schneider
On Monday, 29 October 2018 22:55:44 CET Aurélien Pierre wrote:
> Hallo Heiko !
> 
> Very pleased to hear that (not the bug part, though).
> 
> Did you try with smaller pictures? Usually, darktable falls back to the CPU
> when there are not enough resources available on the GPU.
> 
> You can try :
> 
>  1. the command *nvidia-smi* to see how the GPU RAM is used (if there is
> not enough vRAM available, you will see an OpenCL error code -4).
>  2. setting opencl_async_pixelpipe=true in darktablerc
>  3. setting opencl_mandatory_timeout > 200 in darktablerc
> 
> Also, I have discovered this week that Gnome 3.28.2 with Xorg has
> serious memory leaks issues and can affect OpenCL performance. After
> several hours of uptime, Xorg consumes up to 1GB RAM/vRAM on Ubuntu
> 18.04, so OpenCL has not enough space.

Normally you need to select "very fast gpu" so that darktable runs more stuff 
via OpenCL, and this brings a huge boost when processing.

However, this doesn't work for me right now. I've started to use the ROCm open 
source stuff from AMD, but it doesn't have image support yet.


Andreas

-- 
Andreas Schneider a...@cryptomilk.org
GPG-ID: 8DFF53E18F2ABC8D8F3C92237EE0FC4DCC014E3D





Re: [darktable-dev] OpenCL issues

2018-10-30 Thread sturmflut
Dear list,

On 30.10.18 11:16, Andreas Schneider wrote:

> Normally you need to select "very fast gpu" so that darktable runs more stuff
> via OpenCL, and this brings a huge boost when processing.
>
> However, this doesn't work for me right now. I've started to use the ROCm open
> source stuff from AMD, but it doesn't have image support yet.

I'm using AMDGPU-PRO for my RX570 (like everybody else, probably). It
would be nice if the whole ROCm open source stuff was easy to install,
fully featured and reliable, but as it is even AMDGPU-PRO still has its
problems. Darktable with OpenCL works fine, but e.g. Blender crashes if
I activate the OpenCL backend.

kind regards,
Simon



Mask feathering (was Re: [darktable-dev] OpenCL issues)

2018-10-30 Thread Heiko Bauke

Hi,

On 29.10.18 at 22:55, Aurélien Pierre wrote:

> Very pleased to hear that (not the bug part, though).
>
> [...]
>> In particular, I need to enable blending on GPU.  Currently I am
>> working on automatic mask refinement based on a guided filter, see
>> http://kaiminghe.com/publications/eccv10guidedfilter.pdf


I now have a first fully working implementation of the automatic mask 
refinement algorithm that works within the CPU pixelpipe as well as in 
the GPU pixelpipe.  Currently, all data is just copied to host memory, 
processed by the CPU and then copied back to GPU memory.  The actual GPU 
implementation of the guided filter is still lacking.  But I am working 
on that.  See https://github.com/rabauke/darktable/tree/guided_filter 
for details.
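
For readers who do not know the guided filter, here is a minimal one-dimensional sketch in plain Python. It is not the code from the branch above; it only follows the basic scheme of the linked paper: within each window, a = cov(guide, mask) / (var(guide) + eps) and b = mean(mask) - a * mean(guide), and the output is mean(a) * guide + mean(b). The real use case is two-dimensional, with the image as guide. The names box_mean, guided_filter_1d, radius and eps are chosen for this example, and eps must be positive.

```python
def box_mean(values, radius):
    """Mean over a window of +/- radius samples, truncated at the borders."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def guided_filter_1d(guide, mask, radius, eps):
    """Smooth `mask` while following the edges of `guide`."""
    mean_g = box_mean(guide, radius)
    mean_m = box_mean(mask, radius)
    mean_gm = box_mean([g * m for g, m in zip(guide, mask)], radius)
    mean_gg = box_mean([g * g for g in guide], radius)

    a, b = [], []
    for mg, mm, mgm, mgg in zip(mean_g, mean_m, mean_gm, mean_gg):
        var_g = mgg - mg * mg        # variance of the guide in the window
        cov_gm = mgm - mg * mm       # covariance of guide and mask
        ak = cov_gm / (var_g + eps)  # eps damps a in flat guide regions
        a.append(ak)
        b.append(mm - ak * mg)

    # Average the per-window coefficients, then apply the linear model.
    mean_a = box_mean(a, radius)
    mean_b = box_mean(b, radius)
    return [ma * g + mb for ma, g, mb in zip(mean_a, guide, mean_b)]
```

Where the guide is flat, a goes to zero and the mask is simply averaged; where the guide has a strong edge, a stays close to the local slope and the mask edge snaps to the guide edge. That behaviour is what makes the filter attractive for mask feathering.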


The usual warning: start darktable with the --confdir option so that you 
do not overwrite your working darktable database and configuration if you 
play with this darktable branch.
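
One way to follow this advice is to use a fresh scratch directory for every test run. The sketch below assumes the test build is reachable as "darktable" on the PATH; the helper names and the directory prefix are made up for this example, and only the --confdir option itself comes from the advice above.

```python
import tempfile


def make_scratch_confdir(prefix="darktable-test-"):
    """Create an empty directory to serve as a throwaway config dir."""
    return tempfile.mkdtemp(prefix=prefix)


def darktable_test_command(confdir, binary="darktable"):
    """Build the argument list that starts darktable with its own confdir."""
    return [binary, "--confdir", confdir]
```

Passing the result to subprocess.run(), for example subprocess.run(darktable_test_command(make_scratch_confdir())), starts the test build without touching the regular configuration directory.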



Heiko

