Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-02-01 Thread Daniel Rogers
On Feb 1, 2016 12:40 PM, "Sven Claussner"  wrote:
> On  29.1.2016 at 5:37 PM Daniel Rogers wrote:
>>
>>   * Anyone can do dynamic compilation nowadays with llvm.  Imagine
>>
>> taking the gegl dynamic tree, and compiling it into a single LLVM
>> dynamically compiled function.
>
>
> What exactly do you mean? How is this supposed to work and where is the
> performance advantage if done at runtime?

To your first question: I made that statement as a counterpoint to VIPS
turning a convolution kernel into a set of SSE3 instructions and executing
them.

I believe, though I haven't proven it rigorously, that a gegl graph is
homomorphic to the parse tree of an expression language over images.

In other words, there exists an abstract language for which gegl is the
parse tree.

For example:
a = load(path1)
b = load(path2)
c = load(path3)
out = a * b + c
write(out)

Given suitable types, and suitable definitions for *, =, and +, there is a
gegl graph which exactly describes the program above. (For the record, I
believe the language would have to be single-assignment and lazily
evaluated in order to be homomorphic to the DAG of gegl.)
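This can be made concrete with a minimal toy in Python (not GEGL's actual
API; the node and operation names are invented). It builds exactly the
program above as a single-assignment, lazily evaluated DAG; nothing is
computed, only structure is recorded:

```python
# Toy expression DAG over images; illustrative only, not GEGL's API.
class Node:
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs
    def __mul__(self, other): return Node("multiply", self, other)
    def __add__(self, other): return Node("add", self, other)

def load(path):              # leaf node; nothing is read until evaluation
    return Node("load", path)

a = load("path1")
b = load("path2")
c = load("path3")
out = a * b + c              # builds the DAG, evaluates nothing

def ops(n):                  # walk the DAG and list operation names
    if not isinstance(n, Node): return []
    return [n.op] + [o for i in n.inputs for o in ops(i)]

print(ops(out))              # ['add', 'multiply', 'load', 'load', 'load']
```

The resulting structure is precisely the parse tree of the little program
above, which is the homomorphism claim in miniature.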

If that is the case, you can turn the argument on its head and say that
gegl is just an intermediate representation of a compiled language. This
makes the gegl library itself an interpreter of the IR.

Given these equivalences, you can reasonably ask: can we use a different
IR? Can we transform one IR into another? Can we use a different
interpreter? The answer to all of these is yes, trivially.

So: can we transform a gegl graph into LLVM IR? Can we then pass that IR
to LLVM to produce the machine-code equivalent of our gegl graph?

If we did that, all of LLVM's optimization machinery would come for free.
I would reasonably expect LLVM to merge operations into single loops,
combine similar operations, reduce the overall instruction count, inline
lots of code, reduce indirection, unroll loops, and so on. LLVM has quite
a few optimization passes.
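As a sketch of what that could look like: the fused kernel for
out = a * b + c maps almost one-to-one onto textual LLVM IR. A real
implementation would drive the LLVM C API from inside gegl; this toy just
emits the IR as text (the function name and structure are invented for
illustration):

```python
# Emit textual LLVM IR for the fused per-pixel kernel a * b + c.
# Illustrative only; a real backend would build IR via the LLVM API.
def emit_fused_kernel(name="gegl_fused"):
    return "\n".join([
        f"define float @{name}(float %a, float %b, float %c) {{",
        "entry:",
        "  %mul = fmul float %a, %b     ; node: multiply",
        "  %add = fadd float %mul, %c   ; node: add",
        "  ret float %add",
        "}",
    ])

ir = emit_fused_kernel()
print(ir)
```

Handing a function like this to LLVM's optimizer and JIT is where the
loop merging and inlining mentioned above would happen.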

To your second question: the gegl tree is executed a lot, at least once
for every tile in the output. This is especially true when gegl is used
interactively and the same tree is evaluated thousands or millions of
times with different inputs. You would thus be trading an upfront cost of
building the compiled graph for a reduced runtime per tile, and a reduced
total runtime.
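The trade-off is easy to put in numbers. A back-of-envelope sketch (all
of the costs below are made up purely for illustration):

```python
# When does compiling the graph pay off? Numbers are invented.
compile_cost = 50.0      # ms, one-time cost of compiling the graph
interp_per_tile = 0.40   # ms per tile, interpreted graph walk
jit_per_tile = 0.10      # ms per tile, compiled function

# Break-even: tiles needed before the one-time compile cost is repaid.
break_even = compile_cost / (interp_per_tile - jit_per_tile)
print(break_even)        # roughly 167 tiles; beyond that, compiling wins
```

For an interactive session rendering millions of tiles, the upfront cost
disappears into the noise.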

There are potentially more conservative approaches here: turning a gegl
tree into a set of bytecodes, and refactoring large chunks of gegl into a
bytecode interpreter.
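That conservative route can be sketched in a few lines: flatten the DAG
into a linear program and run it on a small stack machine (the opcodes and
program layout are invented for illustration):

```python
# Toy bytecode interpreter for a flattened image-expression DAG.
def run(program, env):
    stack = []
    for op, arg in program:
        if op == "LOAD":
            stack.append(env[arg])        # stand-in for reading a buffer
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack.pop()

# out = a * b + c, flattened in evaluation order
prog = [("LOAD", "a"), ("LOAD", "b"), ("MUL", None),
        ("LOAD", "c"), ("ADD", None)]
print(run(prog, {"a": 2.0, "b": 3.0, "c": 4.0}))   # 10.0
```

This keeps gegl in charge of execution while still removing most of the
per-node dispatch overhead of walking the graph for every tile.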

A really interesting follow-up is what other kinds of IR and runtimes we
could use. Gegl to JVM bytecode? Gegl to Cg? Gegl to dedicated hardware
(FPGA, DSP, etc.)?

--
Daniel
___
gegl-developer-list mailing list
List address: gegl-developer-list@gnome.org
List membership: https://mail.gnome.org/mailman/listinfo/gegl-developer-list



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-29 Thread Daniel Rogers
On Jan 29, 2016 6:20 AM, "Øyvind Kolås"  wrote:
>
> GEGL is doing single precision 32bit floating point processing for all
> operations, thus should not introduce the type of quantization
> problems 8bpc/16bpc pipelines introduce for multiple filters - at the
> expense of much higher memory bandwidth - the GEGL tile cache size
> (and swap backend) should be tuned if doing benchmarks. If this
> benchmark is similar to one done years ago, VIPS was being tested with
> a hard-coded 8bpc 3x3 sharpening filter while GEGL was rigged up to
> use a composite meta operation pipeline based unsharp mask using
> gaussian blur and compositing filters in floating point. These factors
> are probably more a cause of slow-down than the startup time loading
> all the plug-in shared objects, which still takes more than a second
> on my machine per started GEGL process.

Ah, so this is interesting. I feel that rather than removing gegl from
that list of benchmarks, it would be better to build more benchmarks,
especially ones that call out all the advantages of gegl, e.g. minimal
updates, deep-pipeline accuracy, etc.

It is worth calling out gegl's limitations and being honest about them,
for three reasons. First, they are not fundamental to the design of gegl;
just having a vips backend proves that. Second, a lot of the tricks vips
does, gegl really can learn from, and having benchmarks that do not look
so good is a great way to call out opportunities for improvement. And
third, benchmarks help users make good decisions about whether gegl is a
good fit for their needs. Transparency is one of the deeply valuable
benefits of open source.

In terms of technical projects, here is what this benchmark, and the
discussion around it, inspires:

   - Gegl could load plugins in a more demand-driven way, reducing startup
   costs.
   - Gegl could have multiple pipelines optimized for different use cases.
   A fast 8-bit pipeline is great for previews, for single-operation
   stacks, or when accuracy is not as important to the user.
   - Better threading, including better I/O pipelining, is a great idea to
   lift from vips.
   - Anyone can do dynamic compilation nowadays with llvm. Imagine taking
   the gegl dynamic tree and compiling it into a single LLVM dynamically
   compiled function.

So if any of the above actually appear in patch sets, then we, at least
partially, have this benchmark to thank for motivating that. I can see
ways in which any one of the above projects could benefit GIMP as well.
And in terms of transparency and user benefit, the vips developers'
benchmark also makes me think that there really should be a set of
benchmarks that call out the concrete user benefits of gegl, e.g. higher
accuracy, especially for deep pipelines. If these benefits exist, it must
be possible to measure them, and to show how gegl truly beats everyone
else in its areas of focus. In a very real sense, vips is doing exactly
what they should be. They are saying: "if speed for a single-image,
one-and-done operation is what you need, vips is your tool, and gegl
really isn't." That sounds like an extremely fair statement to me right
now, until some of gegl's limitations in this area are addressed. And
long term, why not?

--
Daniel



Re: [Gegl-developer] VIPS and GEGL performance and memory usage comparison

2016-01-28 Thread Daniel Rogers
Hi Sven,

I am confused. What technical reason is there to assume gegl cannot be as
fast as vips? Is it memory usage? Extra necessary calculations? Some way
in which parallelism is less achievable?

--
Daniel
On Jan 28, 2016 12:58 PM, "Sven Claussner"  wrote:

> Hi,
>
> the developers of VIPS/libvips, a batch image-processing library,
> have a performance and memory usage comparison on their website,
> including a GEGL test. [1]
> Some days ago I told John Cupitt, the maintainer there, some issues
> with the reported GEGL tests.
> In his answer to me, John points out that GEGL is a bit odd in this
> comparison, because it is the only interactive image processing library
> there. He therefore suggests removing GEGL from this list.
>
> What do you GEGL developers think: does anybody need these results, so
> that GEGL should stay in this comparison, or would it be OK if John
> removed it from the list?
>
> Greetings
>
> Sven
>
> [1]
> http://www.vips.ecs.soton.ac.uk/index.php?title=Speed_and_Memory_Use
>
>



Re: [Gegl-developer] New serialization format

2012-07-09 Thread Daniel Rogers

On Jul 5, 2012, at 7:57 AM, Michael Natterer wrote:

> And XML was ruled out because it's not the latest fad any longer?

I think this is pretty much the right answer. There is a ton of XML hate in
the world right now.

Having fought this battle when dealing with millions of lines of code and
hundreds of thousands of lines of JSON and/or XML, I can offer the
following advice…

XML is probably the right answer here. XML sucks in the following ways.

1. It's verbose. This is actually good for humans, but it sucks as a wire
format, and some people feel the verbosity is unreadable. That's only true
if you're able to keep all the context in your head. Once someone screws up
the indentation, or you're 1000 lines in and 12 nesting levels deep, having
the extra context of tag names makes a huge difference. Also, gzip is
awesome here and solves the on-disk space issues.

2. It's complex. No argument here. There are a lot of things it is
supposed to do, and a major ambiguity that people always complain about
(attributes vs. elements).

3. Many of the parsers are memory hogs (tree parsers) or very slow: they
were copying too many strings. That has gotten much better, though, and
doesn't apply to the parser gegl is using.

Points 1 and 3 mean it sucks as an on-wire format for interactive HTTP
requests (though gzip pretty much negates 1). Point 2 means it's hard to
write a fast JS parser for it, which means your HTML5 app will get slow.

Everyone says "it's more readable!" Then they try to maintain a large JSON
file. Then they discover that validation, line numbers for errors, and a
more expressive grammar go a long way towards keeping programs simpler.
The first time you spend an hour trying to track down where your missing
"," caused your entire file to fail to parse, you'll wish you had a better
parser. I haven't found a JSON parser that will actually spit out line
numbers and context for errors. With XML, it's easy to combine multiple
grammars (think embedding GEGL ops into another XML document). It has a
validation language (two of them, in fact; yes, they have warts, but they
do actually work for most things). It's easier for new brains to look at
(though slower for familiar brains). It's more self-describing, for those
who expect their file format to be produced or consumed by many other
programs. It's amazing how important strict specification can be when it
comes to using a file as an interchange format. XML is much better at this
than most other options.
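The line-numbers point is easy to demonstrate with any stock XML parser;
here is a sketch using Python's standard library (not the parser gegl
uses):

```python
# A broken XML document: <node> is opened but never closed, so the
# parser flags the mismatched close tag and reports where it is.
import xml.etree.ElementTree as ET

bad = "<graph>\n  <node id='a'>\n</graph>"
try:
    ET.fromstring(bad)
except ET.ParseError as e:
    line, col = e.position
    print(line)   # 3: the parser points straight at the offending line
```

A stray comma in a large JSON file rarely fails anywhere near this
precisely.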

Anyway, if you expect your serialization to be temporary (like a wire
format), to need fast parsing on a huge variety of hardware in languages
without a byte array (JS), or to be produced and consumed only by your own
application, then JSON (or BSON, or protocol buffers) seems like a good
choice. If you're going for more of an interchange format, stick with XML.

Thus I would strongly suggest using XML for this.

Also, as far as structure goes, if you want to represent a general graph,
you can draw inspiration from DOT, the language of graphviz. There is also
GraphML. You could frankly use GraphML straight out of the box, though it
has lots of features you're probably not interested in.

The general structure is usually:

<graph>
  .. graph attributes …
  <node id="a" .. node attributes .. />
  <node id="b" .. node attributes .. />
  <edge source="a" target="b"/>
</graph>

So you don't try to put a tree in the text at all. It's just a list of
nodes and edges.

--
Daniel
