On 13-10-05 03:04 PM, Christian Kanzian wrote:
> Am Samstag, 5. Oktober 2013 schrieb Patrick Shanahan:
>> * Ochal Christophe <[email protected]> [10-05-13 14:42]:
>>> On 2013-10-05 20:32, Tobias Ellinghaus wrote:
>>>> Am Samstag, 5. Oktober 2013, 19:44:37 schrieb Ochal Christophe:
>>>>
>>>> [...]
>>>>
>>>>> My desktop consists of an i7 with 16GB ram, NVidia card, mSATA SSD
>>>>> (boot disk) and 2 large SATA drives in mirroring mode and it still
>>>>> takes 15 to 30 seconds to process an image from my Nikon D00
>>>>
>>>> Is that for exporting or clicking around in darkroom mode?
>>>
>>> Exporting, I'd go nuts otherwise :D
>>
>> odd, my i7 w/12gb and system on ssd w/2 large sata drives @ raid 1 takes
>> 5-6 sec's per image and *most* images are 24mb/raw from D7100. Images
>> from my 12mb/raw D3 take 3-5 sec's.
>> But since I usually process 600-1500 images at a time and export at the
>> end, export time makes little difference to me as I usually walk away or
>> do something else in the meantime.
>>
>> Still intending to expand RAM to 24 GB in the very near future. I believe
>> the amount of RAM on your video card is a large determinant of speed.
>
> So for my setup (i7 2600; 16 GB RAM; GeForce GT 640 with 2 GB RAM and nvidia
> driver 325.15; no SSD; Pentax K5 DNG) it takes 2 to 3 secs, or 15-20 secs
> with denoising.
>
> Half a year ago I replaced my nvidia card (512 MB) with the GT 640 (2048 MB)
> to get OpenCL working, but OpenCL is slower than CPU-only in my case. Also, X11
> is not responsive during export with OpenCL enabled. I tried different settings
> as suggested, but with no big changes.
>
> Without OpenCL:
> [dev_process_export] pixel pipeline processing took 13.919 secs (89.910 CPU)
>
> With OpenCL:
> [dev_process_export] pixel pipeline processing took 24.525 secs (17.977 CPU)
>
> Export time depends heavily on the applied image settings/modules. To compare
> different setups, a kind of benchmark (same RAW file, same processing steps)
> should be done.
>
>
> I would take:
> - 16 GB RAM
> - i7 CPU
> - recent nvidia card with 2 GB RAM
> - SSD for the system
> - 2x2TB drives with RAID 1
>
> All the best,
> Christian
>
> chri@chk64:~$ darktable -d opencl -d perf
> [opencl_init] device 0 `GeForce GT 640' has sm_20 support.
> [opencl_init] device 0 `GeForce GT 640' supports image sizes of 32768 x 32768
> [opencl_init] device 0 `GeForce GT 640' allows GPU memory allocations of up to 511MB
> [opencl_init] device 0: GeForce GT 640
> GLOBAL_MEM_SIZE: 2047MB
> MAX_WORK_GROUP_SIZE: 1024
> MAX_WORK_ITEM_DIMENSIONS: 3
> MAX_WORK_ITEM_SIZES: [ 1024 1024 64 ]
> DRIVER_VERSION: 325.15
> DEVICE_VERSION: OpenCL 1.1 CUDA
>
Similar setup here to Christian's, but with a slightly older video card:
[opencl_init] device 0 `GeForce GT 630' has sm_20 support.
[opencl_init] device 0 `GeForce GT 630' supports image sizes of 32768 x 32768
[opencl_init] device 0 `GeForce GT 630' allows GPU memory allocations of up to 511MB
[opencl_init] device 0: GeForce GT 630
GLOBAL_MEM_SIZE: 2047MB
MAX_WORK_GROUP_SIZE: 1024
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 64 ]
DRIVER_VERSION: 304.88
DEVICE_VERSION: OpenCL 1.1 CUDA
Note the older driver version being used.
I also shoot Pentax (K5IIs in this case) and work with DNG files when
doing astrophotography, since DeepSkyStacker cannot handle PEF files.
After a tweaking session (mainly adjusting the tone curve, slight
denoising since DSS already does most of the heavy lifting there, and a
saturation boost via color zones), here is the final export run:
[pixelpipe_process] [export] using device 0
[dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.144 secs (0.156 CPU) processing `white balance' [export]
[dev_pixelpipe] took 0.044 secs (0.012 CPU) processing `highlight reconstruction' [export]
[default_process_tiling_cl_ptp] use tiling on module 'denoiseprofile' for image with full size 4950 x 3284
[default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 4100 x 3284 and overlap 32
[default_process_tiling_cl_ptp] tile (0, 0) with 4100 x 3284 at origin [0, 0]
[opencl_denoiseprofile] couldn't enqueue kernel! -4, devid 0
[default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'denoiseprofile' in tiling mode: 0
[opencl_pixelpipe] failed to run module 'denoiseprofile'. fall back to cpu path
[dev_pixelpipe] took 12.437 secs (39.652 CPU) processing `denoise (profiled)' [export]
[dev_pixelpipe] took 0.207 secs (0.172 CPU) processing `input color profile' [export]
[dev_pixelpipe] took 0.123 secs (0.040 CPU) processing `color zones' [export]
[dev_pixelpipe] took 0.065 secs (0.024 CPU) processing `tone curve' [export]
[dev_pixelpipe] took 0.069 secs (0.028 CPU) processing `levels' [export]
[dev_pixelpipe] took 0.082 secs (0.036 CPU) processing `output color profile' [export]
[dev_pixelpipe] took 0.152 secs (0.076 CPU) processing `channel mixer' [export]
[opencl_profiling] spent 0.2753 seconds in [Write Image (from host to device)]
[opencl_profiling] spent 0.0415 seconds in whitebalance_4f
[opencl_profiling] spent 0.0426 seconds in highlights_4f
[opencl_profiling] spent 0.3006 seconds in [Read Image (from device to host)]
[opencl_profiling] spent 0.0491 seconds in denoiseprofile_precondition
[opencl_profiling] spent 1.0551 seconds in denoiseprofile_decompose
[opencl_profiling] spent 0.1123 seconds in colorin
[opencl_profiling] spent 0.1210 seconds in colorzones
[opencl_profiling] spent 0.0619 seconds in tonecurve
[opencl_profiling] spent 0.0658 seconds in levels
[opencl_profiling] spent 0.0775 seconds in colorout
[opencl_profiling] spent 0.0453 seconds in channelmixer
[opencl_profiling] spent 0.0200 seconds in blendop_mask_rgb
[opencl_profiling] spent 0.0782 seconds in blendop_rgb
[opencl_profiling] spent 2.3464 seconds totally in command queue (with 1 event missing)
[dev_process_export] pixel pipeline processing took 13.704 secs (41.432 CPU)
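
To see where the export time goes, the per-module timings in a log like
this can be totalled with a small script. This is only a sketch: the
regular expressions and the function name `summarize` are my own and not
part of darktable, and the sample is a trimmed copy of the log lines
from this run.

```python
import re

# Matches lines such as:
#   [dev_pixelpipe] took 0.144 secs (0.156 CPU) processing `white balance' [export]
MODULE_RE = re.compile(
    r"\[dev_pixelpipe\] took ([\d.]+) secs \(([\d.]+) CPU\) processing `([^']+)'"
)
# Matches the final summary line printed after an export.
TOTAL_RE = re.compile(
    r"\[dev_process_export\] pixel pipeline processing took ([\d.]+) secs"
)


def summarize(log_text):
    """Return ({module name: wall seconds}, total wall seconds or None)."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    modules = {}
    for wall, _cpu, name in MODULE_RE.findall(flat):
        modules[name] = modules.get(name, 0.0) + float(wall)
    match = TOTAL_RE.search(flat)
    total = float(match.group(1)) if match else None
    return modules, total


# Trimmed sample taken from the export log; one line is left wrapped on purpose.
SAMPLE = """
[dev_pixelpipe] took 0.144 secs (0.156 CPU) processing `white balance' [export]
[dev_pixelpipe] took 12.437 secs (39.652 CPU) processing `denoise
(profiled)' [export]
[dev_pixelpipe] took 0.123 secs (0.040 CPU) processing `color zones' [export]
[dev_process_export] pixel pipeline processing took 13.704 secs (41.432 CPU)
"""

modules, total = summarize(SAMPLE)
# Print modules from slowest to fastest with their share of the total.
for name, secs in sorted(modules.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {secs:7.3f} s  {100 * secs / total:5.1f}%")
```

Feeding it the complete output of `darktable -d opencl -d perf` from two
machines, for the same RAW and the same history stack, would give a
like-for-like comparison per module.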
Not sure why OpenCL cannot handle denoising on the GT 630, but after the
fallback to the CPU path that module accounts for the bulk of the
pipeline processing time (12.4 of 13.7 secs).
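
One pointer on the "couldn't enqueue kernel! -4" line: in the Khronos
OpenCL header, error code -4 is CL_MEM_OBJECT_ALLOCATION_FAILURE, i.e.
the device could not allocate memory for a buffer. The sketch below
decodes the code and estimates the size of one tile buffer. It assumes 4
float channels per pixel, which is my guess from the `_4f` kernel names;
the helper name `tile_buffer_mib` is hypothetical, and the log does not
show how many such buffers denoiseprofile needs at the same time.

```python
# Subset of the error codes defined in the Khronos OpenCL header CL/cl.h.
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def tile_buffer_mib(width, height, channels=4, bytes_per_channel=4):
    """Size in MiB of one tile buffer (assumed: 4 x 32-bit float per pixel)."""
    return width * height * channels * bytes_per_channel / 2**20


print(CL_ERRORS[-4])
# Tile dimensions reported by default_process_tiling_cl_ptp in the log.
print(f"{tile_buffer_mib(4100, 3284):.1f} MiB per tile buffer")
```

Under that assumption a single 4100 x 3284 buffer is a bit over 200 MiB,
which fits within the reported 511MB per-allocation limit on its own, so
one possibility is that several buffers of that size are needed together.
That is a guess, not something the log confirms.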
Jack
_______________________________________________
Darktable-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/darktable-users