Hi,
your problem is related to this line:
[opencl_profiling] spent 6.8647 seconds in [Write Image (from host to
device)]
(taken from one of the OpenCL runs in your attached debug output).
This figure is the time needed to transfer data from host memory (your
main RAM) to the graphics card's memory. This step is inherently slow
on all OpenCL systems because the data has to travel over the slow PCI
bus. In your case, however, it is unusually slow.
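To compare such figures across machines, the profiling lines can be
pulled out of the debug output with a short script. This is only a
sketch in Python: the pattern is inferred from the single line quoted
above, the name parse_profile is made up for illustration, and it
assumes each log entry sits on one unwrapped line.

```python
import re

# Matches lines such as:
# [opencl_profiling] spent 6.8647 seconds in [Write Image (from host to device)]
# The pattern is an assumption based on that one example line.
PROFILE_RE = re.compile(
    r"\[opencl_profiling\] spent\s+([0-9.]+) seconds in \[(.+)\]\s*$"
)


def parse_profile(lines):
    """Return {stage name: seconds}, summing stages that appear more than once."""
    totals = {}
    for line in lines:
        m = PROFILE_RE.search(line)
        if m:
            seconds, stage = float(m.group(1)), m.group(2)
            totals[stage] = totals.get(stage, 0.0) + seconds
    return totals


sample = [
    "[opencl_profiling] spent 6.8647 seconds in "
    "[Write Image (from host to device)]",
]
totals = parse_profile(sample)
slowest = max(totals, key=totals.get)
print(f"{slowest}: {totals[slowest]:.4f} s")
```

Feeding it the complete debug output instead of the one sample line
would show which stage dominates a run.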
I am seeing something similar, albeit not to that extent, with an older
GeForce GTS450 which I use here in a dual-GPU system (about 1.5s in
Write Image...). My primary GPU, an AMD HD7950, is fast (about 0.02s in
Write Image...). Not sure if this is related to the dual-GPU setup,
though.
Maybe other users of low-end to mid-range Nvidia GPUs could report
their figures.
Ulrich
On 21.03.2016 at 13:32, Jamie Kitson wrote:
Hi,
I have an Asus UX31VD with 10GB RAM, an i7-3517U and both Intel HD4000
and NVidia GeForce GT 620M GPUs. When I switch OpenCL on in Darktable I
find it runs much slower, I think you can see that from these numbers:
[dev_pixelpipe] took 24.632 secs (22.407 CPU) processed `lens
correction' on GPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 4.648 secs (17.720 CPU) processed `lens correction'
on CPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.264 secs (0.320 CPU) processed `shadows and
highlights' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.050 secs (0.167 CPU) processed `shadows and
highlights' on CPU, blended on CPU [full]
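As a rough sketch (Python, not part of darktable), the GPU/CPU ratio
can be computed from these lines. The regular expression is inferred
from the four entries above, wall_times is a hypothetical helper, and
each entry is assumed to be unwrapped onto a single line.

```python
import re

# Pattern inferred from the quoted log lines; an assumption, not a
# documented format. Module names are quoted as `name' in the log.
PIPE_RE = re.compile(
    r"\[dev_pixelpipe\] took ([0-9.]+) secs \(([0-9.]+) CPU\) "
    r"processed `(.+?)' on (GPU|CPU)"
)

log = [
    "[dev_pixelpipe] took 24.632 secs (22.407 CPU) processed "
    "`lens correction' on GPU with tiling, blended on CPU [export]",
    "[dev_pixelpipe] took 4.648 secs (17.720 CPU) processed "
    "`lens correction' on CPU with tiling, blended on CPU [export]",
    "[dev_pixelpipe] took 0.264 secs (0.320 CPU) processed "
    "`shadows and highlights' on GPU, blended on GPU [full]",
    "[dev_pixelpipe] took 0.050 secs (0.167 CPU) processed "
    "`shadows and highlights' on CPU, blended on CPU [full]",
]


def wall_times(lines):
    """Return {module: {"GPU": wall seconds, "CPU": wall seconds}}."""
    times = {}
    for line in lines:
        m = PIPE_RE.search(line)
        if m:
            wall, module, device = float(m.group(1)), m.group(3), m.group(4)
            times.setdefault(module, {})[device] = wall
    return times


times = wall_times(log)
for module, t in times.items():
    if "GPU" in t and "CPU" in t:
        print(f"{module}: GPU takes {t['GPU'] / t['CPU']:.1f}x the CPU time")
```

For the figures above, both modules take a little over five times as
long on the GPU as on the CPU.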
Reading the Darktable OpenCL post [1], it seems that this could be
caused by a lack of graphics memory. According to various sources
(Windows included) my machine has 2GB of graphics RAM, and according to
the Darktable OpenCL post, Darktable itself won't try to use a GPU with
less than 1GB(?) of memory. However, the only ways I know of checking
the amount of graphics memory in Linux show less than 2GB, e.g.:
clinfo:
Global memory size 1073479680 (1024MiB)
Max memory allocation 268369920 (255.9MiB)
Unified memory for Host and Device No
Integrated memory (NV) No
Global Memory cache type Read/Write
Global Memory cache size 32768
Global Memory cache line 128 bytes
Local memory type Local
Local memory size 49152 (48KiB)
lspci:
Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at f0000000 (64-bit, prefetchable) [size=32M]
lshw:
resources: irq:16 memory:f6000000-f6ffffff
memory:e0000000-efffffff memory:f0000000-f1ffffff
ioport:e000(size=128) memory:f7000000-f707ffff
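The numbers above can be cross-checked with a little arithmetic. The
following Python sketch only converts the byte counts reported by
clinfo and the address ranges reported by lshw; the helper names are
made up for illustration. Note that the lspci/lshw ranges are address
windows, which need not equal the amount of video memory on the card.

```python
MIB = 1024 * 1024


def mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / MIB


def region_size(start_hex, end_hex):
    """Size in bytes of an inclusive address range as printed by lshw."""
    return int(end_hex, 16) - int(start_hex, 16) + 1


# Values copied from the clinfo output above.
global_mem = 1073479680
max_alloc = 268369920

print(f"global memory:  {mib(global_mem):.2f} MiB")
print(f"max allocation: {mib(max_alloc):.2f} MiB")
print(f"max allocation / global memory: {max_alloc / global_mem:.2f}")

# Memory ranges copied from the lshw output above.
regions = [
    ("f6000000", "f6ffffff"),
    ("e0000000", "efffffff"),
    ("f0000000", "f1ffffff"),
]
sizes = [region_size(start, end) for start, end in regions]
for (start, end), size in zip(regions, sizes):
    print(f"{start}-{end}: {mib(size):.0f} MiB")
```

The three ranges come out at 16, 256 and 32 MiB, matching the lspci
sizes, and the maximum allocation is exactly a quarter of the reported
global memory.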
Darktable:
[opencl_init] device 0 `GeForce GT 620M' allows GPU memory allocations
of up to 255MB
[opencl_init] device 0: GeForce GT 620M
GLOBAL_MEM_SIZE: 1024MB
MAX_WORK_GROUP_SIZE: 1024
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 64 ]
DRIVER_VERSION: 361.28
DEVICE_VERSION: OpenCL 1.1 CUDA
Actually dmesg does report 2GB:
[ 1.965859] [drm] Memory usable by graphics device = 2048M
So my questions are:
Am I right in thinking that the Darktable OpenCL slowness is likely down
to a lack of video memory?
From what I've read on the internet it seems that many people run
Darktable using OpenCL on Optimus systems without this issue, can anyone
vouch for that?
Does anyone have any idea how I can either a) have Linux/OpenCL
recognise how much video memory I really have or b) speed up Darktable
with OpenCL?
Would the AMD/ATI instructions [2] help in my case?
Thanks, Jamie Kitson
____________________________________________________________________________
darktable user mailing list