[Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-28 Thread Sven Claussner

Hi,

the developers of VIPS/libvips, a batch image-processing library,
have a performance and memory usage comparison on their website,
including a GEGL test. [1]
A few days ago I pointed out to John Cupitt, the maintainer there, some
issues with the reported GEGL tests.
In his answer John points out that GEGL is a bit odd in this comparison,
because it is the only interactive image-processing library there. He
therefore suggests removing GEGL from the list.

What do you GEGL developers think - does anybody need these results, so
that GEGL should stay in the comparison, or would it be OK if John
removed it from the list?

Greetings

Sven

[1]
http://www.vips.ecs.soton.ac.uk/index.php?title=Speed_and_Memory_Use





Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-28 Thread Daniel Rogers
Hi Sven,

I am confused.  What technical reason exists to assume gegl cannot be as
fast as vips? Is it memory usage? Extra necessary calculations? Some way in
which parallelism is not as possible?

--
Daniel



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-28 Thread Alexandre Prokoudine
On Thu, Jan 28, 2016 at 11:58 PM, Sven Claussner wrote:
> Hi,
>
> the developers of VIPS/libvips, a batch image-processing library,
> have a performance and memory usage comparison on their website,
> including a GEGL test. [1]
> A few days ago I pointed out to John Cupitt, the maintainer there, some
> issues with the reported GEGL tests.
> In his answer John points out that GEGL is a bit odd in this comparison,
> because it is the only interactive image-processing library there. He
> therefore suggests removing GEGL from the list.

Which, of course, might remind you https://github.com/jcupitt/gegl-vips :)

Alex



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-28 Thread Sven Claussner

On 29.01.16 at 01:06 AM Alexandre Prokoudine wrote:
> Which, of course, might remind you https://github.com/jcupitt/gegl-vips :)

Thanks for the reminder, Alex! I remembered that project; it has been
discussed a few times over the years. The last status (2013) was that
libvips could basically be used as a GEGL back-end, but it still needed
area invalidation (see John's post from 10.11.13 on this list). At that
time John said it could take one or two years to get implemented if
nobody volunteered for the job.
Looking at the tremendously higher (batch-processing) performance of
VIPS compared to GEGL (with the tile back-end, I assume), I personally
would very much appreciate a VIPS back-end.

Greetings

Sven




Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-28 Thread Sven Claussner

On  28.1.2016 at 10:29 PM Daniel Rogers wrote:
> Hi Sven,
>
> I am confused.  What technical reason exists to assume gegl cannot be as
> fast as vips? Is it memory usage? Extra necessary calculations? Some way
> in which parallelism is not as possible?


Hi Daniel,

you might have misunderstood me. The performance comparison only shows
that VIPS outperforms GEGL at least in this test.
Technical reasons can be found here:
http://www.vips.ecs.soton.ac.uk/index.php?title=Speed_and_Memory_Use

In a mail John explained the differences to me:
"Gegl is really targeting interactive applications, not batch
processing, and it's doing a lot of work that no one else is doing,
like conversion to scRGB, transparency, caching, and so on."

I didn't claim that GEGL couldn't be as fast as VIPS. It might be
much faster than it is now by using VIPS as a library. This is why
there is gegl-vips, a VIPS-based GEGL back-end.

You'll find more information by searching this list for VIPS, for mails
from John Cupitt and Nicolas Robidoux, or for GEGL's performance in
general.

Greetings

Sven





Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-29 Thread Øyvind Kolås
On Fri, Jan 29, 2016 at 5:41 AM, Sven Claussner  wrote:
> On  28.1.2016 at 10:29 PM Daniel Rogers wrote:
>> I am confused.  What technical reason exists to assume gegl cannot be as
>> fast as vips? Is it memory usage? Extra necessary calculations? Some way
>> in which parallelism is not as possible?
>
> you might have misunderstood me. The performance comparison only shows
> that VIPS outperforms GEGL at least in this test.
> Technical reasons can be found here:
> http://www.vips.ecs.soton.ac.uk/index.php?title=Speed_and_Memory_Use
>
> In a mail John explained the differences to me:
> "Gegl is really targeting interactive applications, not batch
> processing, and it's doing a lot of work that no one else is doing,
> like conversion to scRGB, transparency, caching, and so on."

GEGL does single-precision 32-bit floating-point processing for all
operations, and thus should not introduce the kind of quantization
problems that 8bpc/16bpc pipelines introduce when multiple filters are
chained - at the expense of much higher memory bandwidth. The GEGL tile
cache size (and swap backend) should be tuned when running benchmarks.
If this benchmark is similar to one done years ago, VIPS was being
tested with a hard-coded 8bpc 3x3 sharpening filter while GEGL was
rigged up to use a composite meta-operation unsharp mask built from
gaussian blur and compositing filters in floating point. These factors
are probably more of a cause of the slow-down than the startup time
spent loading all the plug-in shared objects, which still takes more
than a second per started GEGL process on my machine.
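
For reference, a minimal sketch of how the GEGL side of such a benchmark
gets rigged up with the C API - just a load -> gegl:unsharp-mask -> save
chain at default parameters; the file names are placeholders and this is
not necessarily the exact script the VIPS page used:

/* Minimal sketch (assumed harness, not the actual benchmark script):
 * load -> gegl:unsharp-mask -> save with default parameters.
 * Build with: gcc sketch.c `pkg-config --cflags --libs gegl-0.3` */
#include <gegl.h>

int
main (int argc, char **argv)
{
  gegl_init (&argc, &argv);

  GeglNode *graph = gegl_node_new ();
  GeglNode *load  = gegl_node_new_child (graph,
                                         "operation", "gegl:load",
                                         "path", "input.tif", NULL);
  GeglNode *sharp = gegl_node_new_child (graph,
                                         "operation", "gegl:unsharp-mask",
                                         NULL);
  GeglNode *save  = gegl_node_new_child (graph,
                                         "operation", "gegl:save",
                                         "path", "output.tif", NULL);

  gegl_node_link_many (load, sharp, save, NULL);
  gegl_node_process (save);   /* pulling the sink runs the whole pipeline */

  g_object_unref (graph);
  gegl_exit ();
  return 0;
}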

/pippin



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-29 Thread Daniel Rogers

Ah, this is interesting. I feel that rather than removing gegl from
that list of benchmarks, it would be better to build more benchmarks,
especially ones that call out the advantages of gegl, e.g. minimal
updates and deep-pipeline accuracy.

It is worth calling out gegl's limitations and being honest about them
for three reasons. First, they are not fundamental to the design of
gegl; just having a vips backend proves that. Second, gegl really can
learn from a lot of the tricks vips does, and having benchmarks that do
not look so good is a great way to call out opportunities for
improvement. And third, benchmarks help users make good decisions about
whether gegl is a good fit for their needs. Transparency is one of the
deeply valuable benefits of open source.

In terms of technical projects, I feel this benchmark and the discussion
around it inspire the following:

   - Gegl could load plugins in a more demand-driven way, reducing
   startup costs (a rough sketch follows below).
   - Gegl could have multiple pipelines optimized for different use cases.
   - A fast 8-bit pipeline is great for previews or single-operation
   stacks, or when accuracy is not as important for the user.
   - Better threading, including better I/O pipelining, is a great idea
   to lift from vips.
   - Anyone can do dynamic compilation nowadays with llvm. Imagine taking
   the gegl dynamic tree and compiling it into a single LLVM dynamically
   compiled function.
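
On the first point, a rough sketch of what demand-driven loading could
look like - the registry and everything around it is hypothetical, not
GEGL's actual module loader; only the GModule calls are real GLib API:

/* Hypothetical demand-driven loader sketch - not GEGL's module code.
 * Assumes a pre-built table mapping op names to .so paths, so a module
 * is only opened the first time one of its ops is instantiated. */
#include <glib.h>
#include <gmodule.h>

typedef struct {
  const char *op_name;   /* e.g. "gegl:unsharp-mask"        */
  const char *so_path;   /* module that provides the op     */
  GModule    *handle;    /* NULL until the op is first used */
} LazyOp;

static LazyOp registry[] = {
  { "gegl:unsharp-mask", "/usr/lib/gegl-0.3/unsharp-mask.so", NULL },
  /* ... one entry per operation, generated at install time ... */
};

/* Load the module backing an op only when the op is first requested. */
static gboolean
ensure_op_loaded (const char *op_name)
{
  for (gsize i = 0; i < G_N_ELEMENTS (registry); i++)
    if (g_str_equal (registry[i].op_name, op_name))
      {
        if (registry[i].handle == NULL)
          registry[i].handle = g_module_open (registry[i].so_path,
                                              G_MODULE_BIND_LAZY);
        return registry[i].handle != NULL;
      }

  return FALSE;
}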

So if any of the above actually appear in patch sets, then we, at least
partially, have this benchmark to thank for motivating that. I can see
ways in which any one of the above projects could benefit GIMP as well.
And in terms of transparency and user benefit, the vips developers'
benchmark also makes me think that there really should be a set of
benchmarks that call out the concrete user benefits of gegl, e.g. higher
accuracy, especially for deep pipelines. If these benefits exist, it
must be possible to measure them and show how gegl truly beats everyone
else in its areas of focus. In a very real sense, vips is doing exactly
what they should be. They are saying "if speed for a single-image,
one-and-done operation is what you need, vips is your tool, and gegl
really isn't." That sounds like an extremely fair statement to me right
now, until some of gegl's limitations in this area are addressed. And
long term, why not?

--
Daniel



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-29 Thread jcupitt
Hello all, vips maintainer here, thank you for this interesting discussion.

On 29 January 2016 at 16:37, Daniel Rogers  wrote:
> A fast 8 bit pipeline is great for previews or single operation stacks, or
> when accuracy is not as important for the user.

My feeling is that gegl is probably right to be float-only; the cost
is surprisingly low on modern machines. On my laptop, for that
benchmark in 8-bit I see:

  $ time ./vips8.py tmp/x.tif tmp/x2.tif
  real    0m0.504s
  user    0m1.548s
  sys     0m0.104s

If I add "cast(float)" just after the load, and "cast(uchar)" just
before the write, the whole thing runs as float and I see:

  $ time ./vips8.py tmp/x.tif tmp/x2.tif
  real    0m0.578s
  user    0m1.768s
  sys     0m0.148s
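
For concreteness, the cast-to-float variant corresponds to something
like this stripped-down sketch against the vips8 C API - the 3x3 kernel
values and file handling are illustrative only, and the real benchmark
script does more than this:

/* Illustrative sketch of the cast-to-float variant; not the actual
 * vips8.py benchmark script. */
#include <vips/vips.h>

int
main (int argc, char **argv)
{
  VipsImage *in, *as_float, *sharp, *out, *mask;

  if (VIPS_INIT (argv[0]))
    vips_error_exit (NULL);

  in = vips_image_new_from_file (argv[1], NULL);

  /* an illustrative 3x3 sharpen kernel */
  mask = vips_image_new_matrixv (3, 3,
                                 -1.0, -1.0, -1.0,
                                 -1.0, 16.0, -1.0,
                                 -1.0, -1.0, -1.0);
  vips_image_set_double (mask, "scale", 8.0);

  if (!in ||
      vips_cast (in, &as_float, VIPS_FORMAT_FLOAT, NULL) ||
      vips_conv (as_float, &sharp, mask, NULL) ||
      vips_cast (sharp, &out, VIPS_FORMAT_UCHAR, NULL) ||
      vips_image_write_to_file (out, argv[2], NULL))
    vips_error_exit (NULL);

  g_object_unref (in);
  g_object_unref (as_float);
  g_object_unref (mask);
  g_object_unref (sharp);
  g_object_unref (out);
  vips_shutdown ();
  return 0;
}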

Plus float-only makes an opencl path much simpler.

As you say, this tiny benchmark is very focused on batch performance,
so fast startup / shutdown and lots of file IO. It's not what gegl is
generally used for.

John



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-29 Thread Adam Bavier
As someone new to the gegl development list and seeing the performance
numbers in that benchmark, I propose adding an asterisk (*) by each gegl
number; it would help the reader understand that something is different
about this library. Then add the corresponding asterisk down by the
statement, "GEGL is not really designed for batch-style processing -- it
targets interactive applications, like paint programs." Since gegl is
the only interactive library in the list, the asterisk works well enough,
and separating it out into a different table is not necessary.

Best regards,
-Adam Bavier




Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-02-01 Thread Sven Claussner

Hi Daniel,

thanks for sharing your thoughts. I agree with you on many points.

On  29.1.2016 at 5:37 PM Daniel Rogers wrote:

> * Anyone can do dynamic compilation nowadays with llvm.  Imagine
> taking the gegl dynamic tree, and compiling it into a single LLVM
> dynamically compiled function.


What exactly do you mean? How is this supposed to work and where is the
performance advantage if done at runtime?

Greetings

Sven




Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-02-01 Thread Daniel Rogers
On Feb 1, 2016 12:40 PM, "Sven Claussner"  wrote:
> On  29.1.2016 at 5:37 PM Daniel Rogers wrote:
>>
>>   * Anyone can do dynamic compilation nowadays with llvm.  Imagine
>>
>> taking the gegl dynamic tree, and compiling it into a single LLVM
>> dynamically compiled function.
>
>
> What exactly do you mean? How is this supposed to work and where is the
> performance advantage if done at runtime?

To your first question. I made that statement as a counterpoint to vips
turning a convolution kernel into a set of sse3 instructions and executing
them.

I believe, though haven't proven rigorously, that a gegl graph is
homomorphic to a parse tree of an expression language over images.

In other words, there exists an abstract language for which gegl is the
parse tree.

For example:
a = load(path1)
b = load(path2)
c = load(path3)
out = a * b + c
write(out)

Given suitable types, and suitable definitions for *, =, and +, there is
a gegl graph which exactly describes that program above. (For the
record, I believe the language would have to be single-assignment and
lazily evaluated in order to be homomorphic to the DAG of gegl.)
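
To make the correspondence concrete, here is a sketch of that program
expressed with the GEGL C API, assuming gegl:multiply and gegl:add as
the definitions of * and +, and gegl:load/gegl:save for load() and
write(); the paths are placeholders:

/* Sketch: a GEGL graph for  out = a * b + c  (placeholder paths). */
#include <gegl.h>

int
main (int argc, char **argv)
{
  gegl_init (&argc, &argv);

  GeglNode *graph = gegl_node_new ();
  GeglNode *a   = gegl_node_new_child (graph, "operation", "gegl:load",
                                       "path", "a.png", NULL);
  GeglNode *b   = gegl_node_new_child (graph, "operation", "gegl:load",
                                       "path", "b.png", NULL);
  GeglNode *c   = gegl_node_new_child (graph, "operation", "gegl:load",
                                       "path", "c.png", NULL);
  GeglNode *mul = gegl_node_new_child (graph, "operation", "gegl:multiply",
                                       NULL);
  GeglNode *add = gegl_node_new_child (graph, "operation", "gegl:add",
                                       NULL);
  GeglNode *out = gegl_node_new_child (graph, "operation", "gegl:save",
                                       "path", "out.png", NULL);

  gegl_node_connect_to (a,   "output", mul, "input");
  gegl_node_connect_to (b,   "output", mul, "aux");
  gegl_node_connect_to (mul, "output", add, "input");
  gegl_node_connect_to (c,   "output", add, "aux");
  gegl_node_connect_to (add, "output", out, "input");

  gegl_node_process (out);   /* single assignment, evaluated lazily here */

  g_object_unref (graph);
  gegl_exit ();
  return 0;
}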

If that is the case, you can turn the argument on its head and say that
gegl is just an intermediate representation of a compiled language. This
makes the gegl library itself an interpreter of the IR.

Given these equivalencies, you can reasonably ask, can we use a different
IR? Can we transform one IR to another? Can we use a different interpreter?
The answer to all of these is yes, trivially.

So. Can we transform a gegl graph to a llvm IR? Can we then pass that LLVM
IR to llvm to produce the machine code equivalent of our gegl graph?

If we did that, then all of the llvm optimization machinery comes for
free. I would reasonably expect llvm to merge operations into single
loops, combine similar operations, reduce the overall instruction count,
inline lots of code, reduce indirection, unroll loops, etc. LLVM has
quite a few optimization passes.
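
As a toy illustration of the kind of merging meant here - not GEGL code,
just the shape of the per-pixel loops before and after fusion, using the
a * b + c example from above on flat float buffers:

/* Toy illustration of loop fusion for  out = a * b + c. */

/* Interpreted graph: each op walks the buffer separately and needs a
 * temporary buffer between the two ops. */
static void
eval_unfused (const float *a, const float *b, const float *c,
              float *tmp, float *out, int n)
{
  for (int i = 0; i < n; i++) tmp[i] = a[i] * b[i];   /* multiply op */
  for (int i = 0; i < n; i++) out[i] = tmp[i] + c[i]; /* add op      */
}

/* Fused version, as a JIT could emit it: one pass, no temporary,
 * less memory traffic. */
static void
eval_fused (const float *a, const float *b, const float *c,
            float *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = a[i] * b[i] + c[i];
}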

To your second question: the gegl tree is executed a lot - at least once
for every tile in the output. This is especially true if gegl is used
interactively and the same tree is evaluated thousands or millions of
times with different inputs. Thus you would be trading an upfront cost
of building the compiled graph for a reduced runtime per tile, and a
reduced total runtime.

There are potentially more conservative approaches here that turn a gegl
tree into a set of bytecodes and refactor large chunks of gegl into a
bytecode interpreter.

A really interesting follow-up is: what other kinds of IR and runtimes
can we use? Gegl to JVM bytecode? Gegl to Cg? Gegl to an ASIC (FPGA,
DSP, etc.)?

--
Daniel



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-02-02 Thread Øyvind Kolås
On Mon, Feb 1, 2016 at 10:35 PM, Daniel Rogers  wrote:
>>>   * Anyone can do dynamic compilation nowadays with llvm.  Imagine
>>>
>>> taking the gegl dynamic tree, and compiling it into a single LLVM
>>> dynamically compiled function.
>>
>> What exactly do you mean? How is this supposed to work and where is the
>> performance advantage if done at runtime?
>
> To your first question. I made that statement as a counterpoint to vips
> turning a convolution kernel into a set of sse3 instructions and executing
> them.
>
> I believe, though haven't proven rigorously, that a gegl graph is
> homomorphic to a parse tree of an expression language over images.

For a subset of operations this might work, but not for generic ops,
which may well use shared libraries rather than plain arithmetic in
their implementation. An approach that might work out for some of that
subset, and permit reusing existing infrastructure in GEGL, is to
recombine the cores of OpenCL point filters/composers and submit one
combined image-processing kernel for OpenCL compilation - which for
many (most?) OpenCL implementations would end up using LLVM in the
background.
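
A hedged sketch of what recombining the cores could look like on the
host side - the per-op snippet strings and the fused kernel shape are
invented for illustration; only clCreateProgramWithSource and
clBuildProgram are the standard OpenCL API:

/* Illustration only: splice two hypothetical point-op cores into one
 * OpenCL kernel and hand the result to the normal OpenCL compiler. */
#include <stdio.h>
#include <CL/cl.h>

static const char *brighten_core = "v = v + (float4)(0.1f);";
static const char *invert_core   = "v = (float4)(1.0f) - v;";

static cl_program
build_fused_kernel (cl_context ctx, cl_device_id dev)
{
  char src[1024];
  snprintf (src, sizeof src,
            "__kernel void fused (__global const float4 *in,"
            "                     __global float4 *out)"
            "{"
            "  int i = get_global_id (0);"
            "  float4 v = in[i];"
            "  %s %s"
            "  out[i] = v;"
            "}",
            brighten_core, invert_core);

  const char *p = src;
  cl_int err;
  cl_program prog = clCreateProgramWithSource (ctx, 1, &p, NULL, &err);

  if (err != CL_SUCCESS ||
      clBuildProgram (prog, 1, &dev, NULL, NULL, NULL) != CL_SUCCESS)
    return NULL;
  return prog;
}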

This is, however, different from my complaint about the benchmark
comparison - where VIPS uses a 3x3 convolution kernel, while the GEGL
code uses gegl:unsharp-mask, which is gegl:gaussian-blur + a point
composer, which in turn is a horizontal blur and a vertical blur + a
point composer. Comparing a 3x3 area op with a composite, much more
general-purpose sharpening filter - one that can use (and, for the
parameters provided, already does use) a larger input area and by its
nature needs more temporary buffers - is not a proper apples-to-apples
comparison. Adding a gegl:3x3-convolution (or adapting
gegl:convolution-matrix to detect the extent of the kernel) might make
GEGL perform closer to VIPS on this benchmark which caters well to
VIPS features. I do however not think we should add "hard-coded" 3x3
sharpen/blur ops in GEGL.

/pippin



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-02-03 Thread Sven Claussner

Hi,

these are interesting thoughts and I'm glad to read about mathematics
and computer science that go beyond the day-to-day topics.


On  1.2.2016 at 10:35 PM Daniel Rogers wrote:

> Given these equivalencies, you can reasonably ask, can we use a
> different IR? Can we transform one IR to another? Can we use a
> different interpreter? The answer to all of these is yes, trivially.
>
> So. Can we transform a gegl graph to a llvm IR? Can we then pass that
> LLVM IR to llvm to produce the machine code equivalent of our gegl
> graph?


To my understanding, much of this is already covered by the existing
transformations from the GEGL graph language to C/OpenCL to binary code.
One interesting application of IR transformation would IMHO be graph
optimization, the same way query-execution graphs in database systems
are optimized. So, if we have a graph of many GEGL ops, commutative ops
could be reordered to increase computation speed:
(Color Tool, Crop) --> (Crop, Color Tool)

Equivalent or inverse ops could be merged into one single op:
(Brightness +10, Brightness -2) --> Brightness +8
and similarly for other ops working on a convolution matrix.

This would not work for all ops, e.g.
(Blur, Crop) --> (Crop, Blur)
would not work, because the Blur op processes adjacent pixels, which get
lost by cropping (unless we leave the Blur radius as an extra border for
cropping).
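
As a minimal sketch of the merge rule over a linear op list - the Op
struct and the op names are hypothetical and not GEGL's internal graph
representation:

/* Hypothetical rewrite rule: collapse adjacent brightness adjustments
 * in a linear pipeline, e.g. (+10, -2) -> (+8). */
#include <string.h>

typedef struct {
  const char *name;    /* e.g. "brightness", "crop", "blur" */
  double      amount;  /* parameter for ops that have one   */
} Op;

/* Returns the new length of the op list. */
static int
merge_adjacent_brightness (Op *ops, int n)
{
  int w = 0;                                 /* write position */

  for (int r = 0; r < n; r++)
    {
      if (w > 0 &&
          strcmp (ops[r].name, "brightness") == 0 &&
          strcmp (ops[w - 1].name, "brightness") == 0)
        ops[w - 1].amount += ops[r].amount;  /* fold into previous op */
      else
        ops[w++] = ops[r];
    }

  return w;
}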



> To your second question: the gegl tree is executed a lot - at least once
> for every tile in the output. This is especially true if gegl is used
> interactively and the same tree is evaluated thousands or millions of
> times with different inputs. Thus you would be trading an upfront cost
> of building the compiled graph for a reduced runtime per tile, and a
> reduced total runtime.

OK, I agree here.


> A really interesting follow-up is: what other kinds of IR and runtimes
> can we use? Gegl to JVM bytecode? Gegl to Cg? Gegl to an ASIC (FPGA,
> DSP, etc.)?

Interesting brainstorming, too. Considering these options, I think the
OpenCL approach is the most flexible one; it could possibly also be used
to access a DSP's computing power.

Greetings

Sven








Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-02-06 Thread jcupitt
On 2 February 2016 at 12:37, Øyvind Kolås  wrote:
> comparison. Adding a gegl:3x3-convolution (or adapting
> gegl:convolution-matrix to detect the extent of the kernel) might make
> GEGL perform closer to VIPS on this benchmark which caters well to
> VIPS features. I do however not think we should add "hard-coded" 3x3
> sharpen/blur ops in GEGL.

I agree, it's not very fair. I tried with gegl:convolution-matrix, but
it was a lot slower; I'm not sure why.

I realized that the tiff writer is writing float scRGBA, which is also
not very fair. Is there a simple way to make it write a lower
bit-depth image? Sorry for the stupid question.

John



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-02-06 Thread jcupitt
On 29 January 2016 at 19:07, Adam Bavier  wrote:
> numbers in that benchmark, I propose adding an asterisk (*) by each gegl
> number; it would help the reader understand that something is different
> about this library. Then add the corresponding asterisk down by the
> statement, "GEGL is not really designed for batch-style processing -- it
> targets interactive applications, like paint programs." Since gegl is
> the only interactive

That's a good idea, thank you. I've updated the page with numbered
notes by some results to make the qualifications easier to find.

John