On Fri, Jan 21, 2011 at 11:56 PM, Soeren Sandmann <sandm...@cs.au.dk> wrote: > Andrea Canciani <ranm...@gmail.com> writes: > >> > The old code had more lines of code for this part, but on the other hand >> > all >> > this code was in a single place as opposed to being spread around. In some >> > ways, the new code is a bit more limiting. For example, fetching directly >> > to >> > destination without intermediate buffers for SRC operator might be more >> > tricky to implement. >> >> I tried to implement this and it looks much easier than I had expected. >> To support what you're suggesting I need to pass a buffer for each iteration >> of the source, so I'm changing the interface a little: >> - get_scanline receives an iterator, a buffer pointer and a mask pointer >> - the iterator initializers return the flag mask they respected. They >> are allowed >> to ignore some of the flags (on a per-flag basis, we can still specify >> that >> some other flags have to always be respected). >> - I add a new flag ITER_ON_DEST, If this flag is not set, get_scanline can >> use >> any destination buffer as long as it returns a pointer to it; if >> this flag is set, >> it *must* write the data on it (and return a pointer to it as well) >> - The various src_iterators now return the normal flags unless they >> are get_noop >> (or direct bits access, but we never do that) in which case we leave the >> memcpy'ing work to the compositing function by clearing the ON_DEST flag. >> - The caller of a get_scanline function should always pass iter->buffer as >> the buffer unless the ON_DEST flag is enabled. In this case it can pass >> any >> destination buffer. >> >> I got this far: >> http://cgit.freedesktop.org/~ranma42/pixman/commit/?h=wip/itersource&id=bfc9fa36b9767656d5686b18b49fa0d8c01af750 > > The interface described above seems too complicated given that it is > only intended to support this particular SRC optimization. There is a > lot of things to remember by both caller and callee, and there doesn't > seem to be a sensible answer to your question in this comment: > > /* What would ITER_ON_DEST mean for dest iterators? */ > > Here is a different proposal: Add a new virtual function to the > implementation struct: > > src_iter_init_fixed_buffer (..., buffer, stride) > > The intention of that call would be to initialize an iterator that was > required to (a) add stride to the buffer on every scanline, and (b) > always return the current value of the buffer. > > The default implementation of that initialization would then be: > > src_iter_init (&iter, buffer); > > /* override get_scanline */ > iter.saved_get_scanline = iter.get_scanline; > > iter.fixed_stride = stride; > iter.fixed_buffer = buffer; > iter.get_scanline = get_scanline_fixed_buffer; > > where get_scanline_fixed_buffer() would call the saved get_scanline() > and copy the result into the fixed buffer. > > Individual images that are already using the given buffer could simply > add code to add the stride and then rename their existing > src_iter_init() to src_iter_init_fixed_buffer(). The new > src_iter_init() would simply be a call to src_iter_init_fixed_buffer() > with a stride of 0. > > There could then be a fast path in pixman-fast-path.c that would > trigger for the SRC operator and {x,a}8r8g8b8 destinations with > PIXMAN_any and PIXMAN_null for the source and mask formats > respectively. That fast path would be very simple: > > iter_init_fixed_buffer (&iter, dst_buffer, dst_stride); > > while (height--) > iter_get_scanline (&iter); > > Even if no images implement the short-circuiting, this would still be > roughly as efficient as the existing general implementation. > > I haven't implemented the above, so I might be overlooking something, > but if it works, the interface for iterators would be quite a bit > simpler, I think.
Yes, it is simpler and in fact all the images are either using the given buffer or doing noop, so implementing it very easy (just as easy as doing a memcpy if the "noop" is performed from/to different buffers). This is not as flexible as making it possible for iterator constructors to ignore some flags, but I like the approach because it only allows the special fetching to be used on sources and not on destinations. Andrea PS: I'm still unsure if we want this optimization... I think that the gradient optimization is more likely to provide performance improvements. _______________________________________________ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman