On Sun, Apr 17, 2011 at 9:40 PM,  <jcup...@gmail.com> wrote:
> On 17 April 2011 14:24, Øyvind Kolås <pip...@gimp.org> wrote:
>> On my c2d 1.86ghz laptop I get 105s real 41s user with default settings.
>> Setting GEGL_SWAP=RAM in the environment to turn off the disk swapping
>> of tiles makes it run in 43s real 41s user.
>
> I found GEGL_SWAP=RAM, but on my laptop the process wandered off into
> swap death before finishing. Is there some way to limit mem use? I
> only have 2gb.

My laptop has 3gb of RAM and thus doesn't end up crunching swap on such a test.

Setting GEGL_CACHE_SIZE=1300 or so should have a similar effect;
hopefully GEGL wouldn't need to push everything out to swap. (Note
that in doing so you should _not_ set GEGL_SWAP=RAM.) I have noticed
that setting GEGL_THREADS to anything more than 1 causes crashes;
that, along with other things that break more subtly, is the reason
GEGL doesn't default to keeping all cores busy yet.
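
Something like this (just a sketch, assuming GEGL_CACHE_SIZE is read
as megabytes and that GEGL picks the variables up at init time) lets
you cap memory use from inside a test program rather than the shell:

#include <gegl.h>
#include <glib.h>

int
main (int argc, char **argv)
{
  /* cap the tile cache at roughly 1.3GB; GEGL_SWAP is left unset so
     tiles that don't fit still spill to the disk swap */
  g_setenv ("GEGL_CACHE_SIZE", "1300", TRUE);

  gegl_init (&argc, &argv);

  /* ... build and process the graph here ... */

  gegl_exit ();
  return 0;
}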

>> Loading a PNG into a tiled buffer as used by GeglBuffer is kind of
>> bound to be slow; at the moment GEGL doesn't have a native TIFF loader,
>
> You can work with tiled TIFF straight from the file, but sadly for
> striped TIFF (as 90%+ are, groan) you have to unpack the whole file
> first :-(

I'm not sure what a striped TIFF is; if it stores each scanline
separately, GeglBuffer might be able to load data directly from it by
using 1px-high tiles that are as wide as the image.
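
Roughly what I have in mind, as a sketch only (the "tile-width" and
"tile-height" construct properties used here are an assumption about
what such a loader could pass when creating the buffer):

#include <gegl.h>

/* a buffer whose tiles are single scanlines spanning the image width,
   so a scanline-oriented loader could fill one tile per stripe read */
static GeglBuffer *
scanline_buffer_new (gint width, gint height)
{
  return g_object_new (GEGL_TYPE_BUFFER,
                       "x",           0,
                       "y",           0,
                       "width",       width,
                       "height",      height,
                       "format",      babl_format ("R'G'B'A u8"),
                       "tile-width",  width,
                       "tile-height", 1,
                       NULL);
}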

>>> babl converts to linear float and back with exp() and log(). Using
>>> lookup tables instead saves 12s.
>>
>> If the original PNG was 8bit, babl should have a valid fast path for
>> using lookup tables converting it to 32bit linear. For most other
>
> OK, interesting, I shall look at the callgrind output again.

I'd recommend setting the BABL_TOLERANCE=0.004 environment variable as
well, to permit some fast paths with errors around or below 1.0/256,
avoiding babl's rather computationally intensive synthetic reference
conversion code.
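
The conversion in question is roughly the one below; with
BABL_TOLERANCE=0.004 in the environment babl is then free to pick a
lookup-table fast path for the fish instead of the reference path
(sketch only, assuming babl has already been initialised via
babl_init() or gegl_init()):

#include <babl/babl.h>

/* 8-bit nonlinear sRGB pixels -> 32-bit linear float pixels */
static void
to_linear_float (const unsigned char *src, float *dst, long n_pixels)
{
  const Babl *fish = babl_fish (babl_format ("R'G'B'A u8"),
                                babl_format ("RGBA float"));

  babl_process (fish, src, dst, n_pixels);
}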

>>> The gegl unsharp operator is implemented as gblur/sub/mul/add. These
>>> are all linear operations, so you can fold the maths into a single
>>> convolution. Redoing unsharp as a separable convolution saves 1s.
>>
>> For smaller radii this is fine; for larger ones it is not. Ideally
>> GEGL would be doing what is optimal behind the user's back.
>
> Actually, it works for large radius as well. By separable convolution
> I mean doing a 1xn pass then a nx1 pass. You can "bake" the
> sub/mul/add into the coefficients you calculate in gblur.

I thought you meant hard-coded convolutions similar to the
crop-and-sharpen example. Baking it into the convolution might be
beneficial, though at the moment I see it as more important to make
sure Gaussian blur is as fast as possible, since it is the primitive
that this, drop shadow and other commonly employed compositing
operations are all built from.
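
For what it's worth, the folding you describe could look roughly like
the sketch below for a single float channel (clamped edges, not what
the current gegl ops do, and the sub/mul/add ends up in the output
step of the vertical pass rather than in the coefficients themselves):

#include <math.h>
#include <stdlib.h>

/* normalised 1-D Gaussian kernel of radius ceil(3*sigma) */
static float *
gaussian_kernel (float sigma, int *out_radius)
{
  int    r   = (int) ceilf (3.0f * sigma);
  float *k   = malloc ((2 * r + 1) * sizeof (float));
  float  sum = 0.0f;
  int    i;

  for (i = -r; i <= r; i++)
    {
      k[i + r] = expf (-(i * i) / (2.0f * sigma * sigma));
      sum += k[i + r];
    }
  for (i = 0; i < 2 * r + 1; i++)
    k[i] /= sum;

  *out_radius = r;
  return k;
}

/* out = in + amount * (in - gaussian_blur (in)),
   i.e. out = (1 + amount) * in - amount * gaussian_blur (in) */
void
unsharp_separable (const float *in, float *out, int w, int h,
                   float sigma, float amount)
{
  int    r;
  float *k   = gaussian_kernel (sigma, &r);
  float *tmp = malloc (w * h * sizeof (float));
  int    x, y, i;

  /* horizontal 1xn pass: tmp = Gx * in */
  for (y = 0; y < h; y++)
    for (x = 0; x < w; x++)
      {
        float acc = 0.0f;
        for (i = -r; i <= r; i++)
          {
            int xi = x + i;
            xi = xi < 0 ? 0 : (xi >= w ? w - 1 : xi);
            acc += k[i + r] * in[y * w + xi];
          }
        tmp[y * w + x] = acc;
      }

  /* vertical nx1 pass with the unsharp maths folded in:
     out = (1 + amount) * in - amount * (Gy * tmp) */
  for (y = 0; y < h; y++)
    for (x = 0; x < w; x++)
      {
        float acc = 0.0f;
        for (i = -r; i <= r; i++)
          {
            int yi = y + i;
            yi = yi < 0 ? 0 : (yi >= h ? h - 1 : yi);
            acc += k[i + r] * tmp[yi * w + x];
          }
        out[y * w + x] = (1.0f + amount) * in[y * w + x] - amount * acc;
      }

  free (tmp);
  free (k);
}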

/Øyvind K.
-- 
«The future is already here. It's just not very evenly distributed»
                                                 -- William Gibson
http://pippin.gimp.org/                            http://ffii.org/