Hi Holger,

On Tue, Dec 17, 2019 at 12:26 AM Holger Wünsche <
holger.o.wuens...@t-online.de> wrote:

> Hi Jochen,
>
>
> I used your command from the other email to export this image. It took
> 50s (10s faster than yours), with ~40s spent in the pixel pipeline (also
> 10s less than yours). I used the opportunity to look at the system usage,
> and I would say the GPU is not the bottleneck. In nvtop (a Linux program
> for monitoring Nvidia GPU usage) the GPU was mostly idle, and disabling
> OpenCL had only minimal impact. As a result I would say the CPU is what
> slows the export down. This is somewhat supported by the measurements
> from Al using his 8-core CPU. On my system, however, using more cores
> doesn't scale that well: 1 core took 130s, 2 cores took 80s, and 4 cores
> (+Hyperthreading) still needed 48s.
>
> The most expensive modules are the exposure 1+2 and tone curve 3. These
> are the three modules with masks. When removing them the time is down to
> 6s.
>

Thanks for taking the time to export the file on your machine. Your
observation is in line with my own: the exposure and tone curve modules
take most of the time. Both use masks, but not drawn ones as Ulrich
suggested (at least afair, I'd have to look it up). However, I usually make
broad use of parametric masks with feathering. If that's the performance
bottleneck, then I'll cross my fingers for the improvements Ulrich
mentioned.
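
To put numbers on the multi-core scaling you measured, here is a minimal
sketch that turns your three timings (130s, 80s, 48s) into speedup and
per-core efficiency. The function name scaling_table is my own invention,
and I count the 4-core run as 4 physical cores and ignore Hyperthreading,
which is an assumption on my part.

```python
# Wall-clock export times in seconds from Holger's runs, keyed by core count.
# The 4-core run had Hyperthreading enabled; it is counted as 4 cores here.
timings = {1: 130.0, 2: 80.0, 4: 48.0}


def scaling_table(timings):
    """Return (cores, speedup, efficiency) tuples relative to the 1-core run."""
    base = timings[1]
    rows = []
    for cores in sorted(timings):
        speedup = base / timings[cores]   # how much faster than one core
        efficiency = speedup / cores      # fraction of ideal linear scaling
        rows.append((cores, speedup, efficiency))
    return rows


for cores, speedup, efficiency in scaling_table(timings):
    print(f"{cores} cores: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

With these numbers the efficiency drops as cores are added, so part of the
pipeline does not seem to parallelize well on your machine.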

Thanks and best wishes,

  Jochen


> Best regards,
>
> Holger
>
> On 12/17/19 12:02 AM, Аl Воgnеr wrote:
> > Am Mon, 16 Dec 2019 13:01:32 +0100
> > schrieb Holger Wünsche <holger.o.wuens...@t-online.de>:
> >
> >> Hi Jochen,
> >>
> >>
> >> just looking at the numbers on Wikipedia [1] the RTX2060 Super has
> >> approx 60% more computation power (GFLOPS) and double the memory
> >> bandwidth. However I don't know if you would see speed improvements
> >> using another GPU, because the bottleneck might be something else. If
> >> you want you can send me (not per email ;) ) one of your images (and
> >> xmp-file) and I export it on my computer (i7 6700k, RTX2060 (non
> >> Super)) and measure the time it takes to export.
> >>
> >> Others might know how to find the bottleneck and tell you what limits
> >> your export times
> > Hi,
> >
> > after a lot of searching on the web I decided to buy an NVIDIA
> > Corporation TU116 [GeForce GTX 1660]. Note: the more expensive Ti is not
> > remarkably faster; it is nearly the same speed, depending on the photo.
> > I think this card has the best price-to-value ratio. If you are not
> > adventurous with Linux, use Nvidia, and/or search for reports of trouble
> > with AMD. It doesn't help to read that things will get better or what
> > might be possible someday. Note the differences when you compare. In the
> > end your system has to work now, not maybe some day. IMHO the CPU is not
> > that important compared to the GPU.
> >
> > Would be interested how long the RTX2060 takes with the bench-file.
> >
> > Try installing the Phoronix Test Suite to get comparable test files
> > and do something like the following:
> >
> > You can download the bench-file here, but the phoronix suite contains
> > more test files to compare:
> > https://math.dartmouth.edu/~sarunas/bench_raw/
> >
> > $ darktable-cli bench.srw bench.srw.xmp bench.jpg --core -d perf -d opencl
> >
> > ...
> >
> > 0.147530 [opencl_init] device 0 `GeForce GTX 1660' has sm_20 support.
> >
> > 0.147660 [opencl_init] device 0 `GeForce GTX 1660' supports image sizes
> > of 32768 x 32768
> >
> > 0.147663 [opencl_init] device 0 `GeForce GTX 1660' allows GPU memory
> > allocations of up to 1485MB
> >
> > [opencl_init] device 0: GeForce GTX 1660
> >       GLOBAL_MEM_SIZE:          5942MB
> >       MAX_WORK_GROUP_SIZE:      1024
> >       MAX_WORK_ITEM_DIMENSIONS: 3
> >       MAX_WORK_ITEM_SIZES:      [ 1024 1024 64 ]
> >       DRIVER_VERSION:           435.21
> >       DEVICE_VERSION:           OpenCL 1.2 CUDA
> > ...
> >
> > 5,678742 [opencl_profiling] spent  2,4782 seconds totally in command
> > queue (with 0 events missing)
> >
> > 5,678749 [dev_process_export] pixel pipeline processing took 3,343 secs
> > (8,930 CPU)
> >
> > [export_job] exported to `bench_02.jpg'
> >
> > 6,009675 [opencl_summary_statistics] device 'GeForce GTX 1660' (0): 551
> > out of 551 events were successful and 0 events lost
> >
> > Here is the CPU-only result; I use a Ryzen 3700X:
> >
> > 18,169530 [dev_process_export] pixel pipeline processing took 16,369
> > secs (230,259 CPU)
> >
> > So using the GPU is about 5 times faster for me.
> >
> > For more tests, see:
> > https://math.dartmouth.edu/~sarunas/darktable_bench.html
> >
> > What I found out is that it gets a lot more expensive if you want
> > significantly more speed, but I cannot give you advice on what to
> > use if you want to spend less money. Always ask people who actually
> > use the card, rather than people who only talk about theory. There are
> > a lot of details. It didn't work for me out of the box; I had to install
> > some packages, and I had to guess which ones. You always have to check
> > which operating system / graphics driver is needed. In my case I had to
> > use Ubuntu 19.10. Note that this distro uses a very outdated exiv2
> > version, as discussed in another thread here.
> >
> >
> > Al
> >
> ____________________________________________________________________________
> > darktable user mailing list
> > to unsubscribe send a mail to
> darktable-user+unsubscr...@lists.darktable.org
> >
>
>
