Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-22 Thread François T .
I guess he is talking about algorithms where the entire buffer needs to be
evaluated before processing the pixels, like knowing the maximum luma value
in your buffer so you can divide all your pixels by it (a very simple
example, just for the sake of argument), or something like the Retinex
algorithm, which re-contrasts locally based on values from the entire buffer.
If you only have tiles and part of the image at processing time, that would
be an issue.
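
To make that concrete, a minimal sketch in plain C (hypothetical code, not
Blender's): the second loop cannot start until the first loop has visited
every pixel, which is exactly what a single tile cannot provide.

    #include <stddef.h>

    /* Sketch only: normalize a single-channel float buffer by its global
     * maximum. Pass 1 must see every pixel before pass 2 may write any
     * output pixel, so the node cannot run from one input tile alone. */
    static void normalize_by_max(const float *in, float *out, size_t num_pixels)
    {
        float max_val = 0.0f;
        for (size_t i = 0; i < num_pixels; i++)   /* pass 1: global reduction */
            if (in[i] > max_val)
                max_val = in[i];

        if (max_val <= 0.0f)
            max_val = 1.0f;                       /* avoid division by zero */

        for (size_t i = 0; i < num_pixels; i++)   /* pass 2: per-pixel rescale */
            out[i] = in[i] / max_val;
    }
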
But even with that, I don't understand why it would be a problem. As I
understand it, only the node itself works per tile, but the I/O of each node
is a full buffer, right?
I don't know how this works exactly, and I can understand his fear about it,
but again, I'm pretty sure we are not the first compositor doing tile-based
processing, right :) ?



2011/1/22 Jeroen Bakker j.bak...@atmind.nl

 On 01/21/2011 04:14 PM, Aurel W. wrote:
  You are talking about things such as convolution with a defined kernel
  size. There are other operations and a compositor truly transforms an
  image to another image and not pixels to pixels etc. If it's
  implemented in such a naive way, the compositor will be very limited.
  I got a very bad feeling about this
 
  Ok, let's normalize an image with a tile based approach,... uh damn
 it
 Aurel, don't worry about that. Tile-based means that the output is part of
 a tile, but the input data can be the whole image or a part of it. On the
 technical side there will be some issues to overcome (mostly device-memory
 related). Btw, when you need every image pixel as input there are ways to
 use an intermediate result to reduce the memory need; I did this already in
 the defocus node.

 Please help me to determine the cases where a whole output image is
 needed. IMO input is read-only and output is write-only. I don't see the
 need atm to support whole output images in a 'per output pixel' approach,
 and every 'per input pixel' approach can be rewritten as a 'per output
 pixel' approach. In the current nodes the two approaches are mixed.

 Jeroen

 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers




-- 

François Tarlier
www.francois-tarlier.com
www.linkedin.com/in/francoistarlier
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-22 Thread Aurel W.
Hi Jeroen,

 Please help me to determine the case when a whole output image is
 needed. IMO input is readonly and output is writeonly. I don't see the
 need atm to support whole output images in a 'per output pixel'
 approach. And every 'per input pixel' approach can be written by a 'per
 output pixel' approach. In the current nodes the two approaches are mixed.
The problem with the concept of pixel-to-pixel operations is also that it
tends to be implemented with a lot of overhead: think of having three frames
on the call stack just to add two pixels, and that for every pixel in the
buffer. It is really nasty. This is why even adding buffers together is
rather inefficient at the moment. Another example would be the filter node,
with its pixel_processors for convolution. If you really think about
low-level efficiency, down to the level of single instructions, a lot could
be done better at the moment.
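
For illustration, a small sketch in plain C (hypothetical names, not the
actual compositor code) of the difference between dispatching a pixel
operation through a callback for every pixel and running one flat loop over
the buffer:

    #include <stddef.h>

    typedef void (*pixel_op)(const float *a, const float *b, float *out);

    static void add_pixel(const float *a, const float *b, float *out)
    {
        for (int c = 0; c < 4; c++)               /* RGBA */
            out[c] = a[c] + b[c];
    }

    /* Per-pixel dispatch: one indirect call (and stack frame) per pixel. */
    static void add_buffers_per_pixel(const float *a, const float *b, float *out,
                                      size_t num_pixels, pixel_op op)
    {
        for (size_t i = 0; i < num_pixels; i++)
            op(a + i * 4, b + i * 4, out + i * 4);
    }

    /* Whole-buffer version: one tight loop the compiler can vectorize. */
    static void add_buffers_flat(const float *a, const float *b, float *out,
                                 size_t num_floats)
    {
        for (size_t i = 0; i < num_floats; i++)
            out[i] = a[i] + b[i];
    }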

I also realize that the argument 'it would work with the current
compositor' is a strong one. But I have some problems with that.
First of all, I think that a compositor should in principle be able
to support all image processing operations. I think it's a rather bad
idea to be stuck with a very limited architecture which already
requires a bunch of hacks to implement the functionality of current
nodes, such as those doing convolution.

Another problem I see with tiling is that you are doing spatial
partitioning and are therefore stuck in the spatial domain. But there
are a lot of possibilities for working in the gradient and frequency
domains, including speedups. You won't be able to convert a tile to the
gradient domain, because you can't determine the correct gradient at
the borders. When you want to work in the frequency domain you also run
into issues with tiling, again because of the spatial partitioning.

But back to the simple issue of operations which need full buffer
access. I agree that this could still be done with tiling, because you
can simply compute all input tiles and just access those when
computing one single output tile. Is that roughly how this should
work? At least the diagram in your document looks like this. Any
other workarounds, like using overlapping tiles for the very special
case of a 3x3 kernel convolution, are just hacks and will prevent the
implementation of future nodes which have other non pixel-to-pixel
operations.

One such future node, for instance, could be tone mapping. This is a
standard feature in Lux, for example, so I guess it's not that absurd to
include such features in Blender's compositor. And some tone mapping
algorithms need to operate on the entire image.

In terms of memory usage, caching, etc., if we assume that only reasonably
sized buffers are used (let's say up to 64 MB), I also don't see strong
benefits in using tiles rather than buffers which hold the entire image.
But maybe you have to be more specific about the caching scheme you want to
use here.

aurel
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-22 Thread Matt Ebb
On Sat, Jan 22, 2011 at 8:11 PM, Aurel W. aure...@gmail.com wrote:
 I also realize that the argument it would work with the current
 compositor is a strong argument. But I got some problems with that.
 First of all I think that a compositor should be in principal be able
 to support all image processing operations.

On the contrary, I think the compositor should be designed and
optimised for its purpose, compositing CGI/vfx imagery. It doesn't
need to be a completely generalised image processing system, it just
needs to do what it's intended for, well. So far I've seen mostly
theoretical objections here, but I think it's important to keep
focused on enabling people to produce shots.

 But back to the simple issue with operations, which need full buffer
 access. I agree that this could be still done with tiling, because you
 can simply compute all input tiles and just access those when
 computing one single output tile.

Or rather the tiles that are necessary at any given time. In the case
of the Normalize node for example (which is mostly useless for
animated sequences, as are any tone mapping operators that work in a
similar way), it would be possible to retrieve each tile one by one in
a pre-process, read and store the statistical information, and then
apply that per tile or even per pixel.
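
A rough two-pass sketch of that scheme in plain C (tile layout and names are
hypothetical): a pre-pass reduces each tile to its statistics, and the main
pass then applies the combined result per tile.

    #include <stddef.h>

    typedef struct { const float *pixels; size_t num_pixels; } Tile;

    /* Pre-pass: reduce one tile to its local maximum. */
    static float tile_max(const Tile *t)
    {
        float m = 0.0f;
        for (size_t i = 0; i < t->num_pixels; i++)
            if (t->pixels[i] > m)
                m = t->pixels[i];
        return m;
    }

    /* Main pass: apply the global statistic to one output tile. */
    static void tile_apply_normalize(const Tile *in, float *out, float global_max)
    {
        const float scale = (global_max > 0.0f) ? 1.0f / global_max : 1.0f;
        for (size_t i = 0; i < in->num_pixels; i++)
            out[i] = in->pixels[i] * scale;
    }

The scheduler would run tile_max over all input tiles first, take the
maximum of the maxima, and only then schedule tile_apply_normalize per
output tile, which can run in parallel.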

 In terms of memory usage, caching, etc. if we assume that only
 reasonable sized buffers are used, let's say up to 64MB, I also don't
 see the strong benefits in using tiles rather than buffers, which hold
 the entire image.

The benefits are lower memory usage, and better/easier parallelisation.
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-22 Thread Aurel W.
 On the contrary, I think the compositor should be designed and
 optimised for its purpose, compositing CGI/vfx imagery. It doesn't
 need to be a completely generalised image processing system, it just
 needs to do what it's intended for, well. So far I've seen a mostly
 theoretical objections here, but I think it's important to keep
 focused on enabling people to produce shots.
Even if we just consider the existing nodes, there are lots of issues
with tiles. To be future-proof, other nodes and operations also have
to be considered. I mentioned some; they might not be the best
examples, but they demonstrate issues with the design. Strong limits
in this tile-based design are a drawback. Sorry if these seem to be just
theoretical objections, but just comparing a design against the existing
nodes won't do.

 Or rather the tiles that are necessary at any given time. In the case
 of the Normalize node for example (which is mostly useless for
 animated sequences, as are any tone mapping operators that work in a
 similar way), it would be possible to retrieve each tile one by one in
 a pre-process, read and store the statistical information, and then
 apply that per tile or even per pixel.
Again, these are just examples of operations on images... and I guess
tone mapping isn't such a bad one, especially if you consider
compositing of single images, not animations.

 The benefits are lower memory usage, and better/easier parallelisation.
In practice, if you assume that your memory can hold multiple buffers
anyway, I can't see significant improvements in memory usage. We also have
to distinguish between two use cases here: the one where the compositing
graph is just executed once, and the one where a user interactively
adjusts settings and wants to keep intermediate results in memory.
Again, there is no proposal for a caching scheme for the tile-based
solution in the interactive case yet, and I can't think of anything
that would have large benefits compared to working on full buffers and
caching those as well.

I also highly doubt that this will lead to better/easier
parallelisation. I still think that more fine-grained parallelisation
within each individual node, operating on the entire buffer, would turn
out better in practice.

At the very least I want to have a discussion about this. Prematurely
assuming that tiles will give better performance is not a good
idea.

aurel
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-22 Thread François T .

 Jeroen: Like what are tiles in the perspective of a user/artist and
 what are tiles in the perspective of parallelization.
 Both definitions are right, but developers and users mix up the definition
 and the meaning. Sorry for that.

That's what I thought :)



 Aurel: Again, just examples for operations on images... and I guess
 tone-mapping isn't such a bad one, especially if you consider
 compositing of single images, not animations.


The goal here is to build a compositor and, as Matt says, it should do what
it's supposed to. I know that for now the compositor was designed to enhance
renders, mostly stills (and IMO that is also the reason it has some
limitations today), much like XSI had its FX Tree. Thinking about using it
for stills is like telling me you would use After Effects to do Photoshop
work. It is possible, and yes, they could make one tool out of both, but
there is a good reason they are two different pieces of software, even if in
the end they do really similar things.
Just to say, I believe the design should concentrate above all on large
images in memory (4K & higher is coming in the future for sure) and on
animation.




Some gradient-based algorithms & very fast blurs need the full buffer for
sure, but I don't understand why some nodes couldn't say 'I need the full
buffer, so I'll wait for all my parents to compute and give me a full buffer
as input', while other nodes (by default) work on tiles. Only a few would be
slower than the others, but everybody would be happy, right?
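
A hedged sketch (hypothetical API, not the actual proposal) of what such a
declaration could look like: each node type states whether it can be fed
tiles or must wait for full buffers, and the scheduler honours that flag.

    /* Sketch only: how a node could declare its input requirement. */
    typedef enum {
        NODE_INPUT_TILE,        /* default: can be fed one tile at a time    */
        NODE_INPUT_FULL_BUFFER  /* must wait until all parent tiles are done */
    } NodeInputMode;

    typedef struct {
        const char *name;
        NodeInputMode input_mode;
    } NodeTypeInfo;

    static const NodeTypeInfo NODE_MIX       = { "Mix",       NODE_INPUT_TILE };
    static const NodeTypeInfo NODE_NORMALIZE = { "Normalize", NODE_INPUT_FULL_BUFFER };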

Actually I wonder if Nuke isn't doing some kind of similar thing. Matt?
The reason I think of that is that in some Nuke scripts I have seen
some nodes updating all together, and then one of them updating kind of
separately, as if it was waiting for something. But again, maybe I don't
really understand the issue here, my apologies :(

F


2011/1/22 Aurel W. aure...@gmail.com

  On the contrary, I think the compositor should be designed and
  optimised for its purpose, compositing CGI/vfx imagery. It doesn't
  need to be a completely generalised image processing system, it just
  needs to do what it's intended for, well. So far I've seen a mostly
  theoretical objections here, but I think it's important to keep
  focused on enabling people to produce shots.
 Even if we just assume the existing nodes, there are lots of issues
 with tiles. To be future-proof, also other nodes and operations have
 to be considered. I mentioned some, they might not be the best
 examples, but they demonstrate issues with the design. Strong limits
 in this tile based design are a con. Sorry if this seems to be just
 theoretical objections, but just comparing a design to the existing
 nodes won't do it.

  Or rather the tiles that are necessary at any given time. In the case
  of the Normalize node for example (which is mostly useless for
  animated sequences, as are any tone mapping operators that work in a
  similar way), it would be possible to retrieve each tile one by one in
  a pre-process, read and store the statistical information, and then
  apply that per tile or even per pixel.
 Again, just examples for operations on images,... and I guess
 tone-mapping isn't such a bad one, especially if you consider
 compositing of single images, not animations.

  The benefits are lower memory usage, and better/easier parallelisation.
 In practice, if you assume that your memory can hold multiple buffers
 anyway, I can't significant improvements in memory usage. We also have
 to distinguish between two use cases here, the one where compositing
 graph is just executed once and the one where a user interactively
 adjusts settings and wants to keep intermediate results in memory.
 Again, there is no proposal for a caching scheme for the tiled based
 solution in the interactive case yet and I can't think of anything
 that would have large benefits compared to work on full buffers and
 also cache those.

 I also highly doubt that this will lead to better/easier
 parallelisation. I still think that more fine grained parallelisation
 in each individual node, operating on the entire buffer, would turn
 out better in practice.

 At least I want to have a discussion on this. Just to assume
 prematurely, that tiles will give better performance is not a good
 idea.

 aurel
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers




-- 

François Tarlier
www.francois-tarlier.com
www.linkedin.com/in/francoistarlier
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-22 Thread Ton Roosendaal
Hi,

To be honest, long-winded discussions on how to implement stuff
should not take away a developer's freedom to find out the optimal cases
him/herself. I'm confident that Jeroen is aware of the
boundary cases here, and he will try to find a good balance for
practical usage.

As long as we agree on the existing and future demands on compositing in
Blender, we should give him our blessing :)

Relevant specs are for example:
- desired input methods, like storage, types, UI workflow,  
colorspaces, alpha, (plugins?)
- desired output specifications, like memory/cpu/gpu performance and  
visual feedback

-Ton-


Ton Roosendaal  Blender Foundation   t...@blender.org   www.blender.org
Blender Institute   Entrepotdok 57A  1018AD Amsterdam   The Netherlands

On 22 Jan, 2011, at 13:28, Aurel W. wrote:

 Some gradient-based algorithms & very fast blurs need the full buffer
 for sure, but I don't understand why some nodes couldn't say 'I need the
 full buffer, so I'll wait for all my parents to compute and give me a full
 buffer as input', while other nodes (by default) work on tiles. Only a few
 would be slower than the others, but everybody would be happy, right?

 Yes something like that would be necessary. I guess in practice it
 will be very hard to determine the required tiles, so maybe there will
 be only two cases, one where only one tile is needed and the one where
 simply all tiles are needed.

 I am also worried about the memory layout of this. Single tiles would
 be computed into separate data structures, maybe just a single array.
 All tiles are computed like this for an entire image. The next node,
 which needs to operate on the entire image, now has to access
 individual pixels in all tiles. So you have two options: introduce
 some sort of abstraction to access these pixels, or copy all tiles to
 a single buffer, which then gets processed. The first one adds a lot
 of overhead and cache unfriendliness. The second one also adds
 overhead and memory usage by copying. Of course this would need
 testing and better analysis, but it can tremendously slow things down.

 the goal here is to build a compositor and as Matt says, it should  
 do what
 it's supposed too.
 well, of course,...

 aurel
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers


___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-22 Thread GSR
Hi,
j.bak...@atmind.nl (2011-01-22 at 0952.44 +0100):
 image. The highest/lowest value is calculated once (not parallelized) &
 the pixel processor is parallelized.

Not a very good example ;] as this search problem is nearly as
parallelizable as the pixel processor would be. Split the work into N
workers; each one gets total_pixels/N (or tiles or whatever) and looks
for the local max and min. Then scan the N maxes to get the final max,
and do the same with the N mins. Even if you have a system with 1024
workers, that is only an extra non-parallel pass of 1024 checks
(assuming you do not parallelize that pass again, e.g. having four workers
do 256 each and finally comparing four results).
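
A minimal sketch of that reduction in plain C (hypothetical names): each
worker scans its own slice, and one short serial pass combines the N
partial results.

    #include <float.h>
    #include <stddef.h>

    typedef struct { float min, max; } MinMax;

    /* Per-worker step: scan one slice [begin, end) of the buffer. */
    static MinMax local_minmax(const float *buf, size_t begin, size_t end)
    {
        MinMax r = { FLT_MAX, -FLT_MAX };
        for (size_t i = begin; i < end; i++) {
            if (buf[i] < r.min) r.min = buf[i];
            if (buf[i] > r.max) r.max = buf[i];
        }
        return r;
    }

    /* Final step: combine the N partial results (N comparisons, not W*H). */
    static MinMax combine_minmax(const MinMax *partial, int n)
    {
        MinMax r = partial[0];
        for (int i = 1; i < n; i++) {
            if (partial[i].min < r.min) r.min = partial[i].min;
            if (partial[i].max > r.max) r.max = partial[i].max;
        }
        return r;
    }

Each local_minmax call can run in its own thread or OpenCL work-group; only
combine_minmax is serial.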

So the question is whether you want to process in pixel stacks (where the
final result for pixel X,Y is known before X+1,Y) or in buffers (work on
one set of tiles and never look at them again unless something down the
node tree changes). If you want the final full image, you will do the
full work in both cases anyway. Exceptions aside, you probably want the
buffer approach (with a tiled internal organization, that is fine),
because that way the code cache gets lots and lots of hits, and the data
cache probably does too. The other way you are thrashing all the caches.

GSR
 
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Jeroen Bakker
Hi Vilem

On 01/21/2011 01:12 AM, Vilem Novak wrote:
 I'd like to have 2 more questions:
 Where did go the idea of integrating gegl as the library
 driving compositor processing(originally 1 of durian targets?)?
I don't know; perhaps one of the Durian people can elaborate on this. I
myself see pros and cons in using this library inside Blender. It has
already solved issues we are trying to tackle now, but looking at the
requirements that our users have, I am not sure the library will be
that suitable (granularity of the nodes/operators).
 Will it be harder to develop nodes for the tile based system than now? will 
 it still be possible to write
 non-tile based nodes, or non-opencl nodes?
No, implementing a (tile-based) node will be different, but easier. The
hard part will not be visible to the node developer. The developer is
not aware that OpenCL exists or that it is tile-based. There will be a
difference in that everything has to be written as pixel processors.
Currently I have not seen a requirement for non-tile-based nodes. I have
seen the requirement for non-OpenCL nodes (Py expressions, image loading
and saving, displaying, etc.), so that is in scope.

Jeroen

___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Aurel W.
Another question I am concerned with.

What do you mean by tiles in the context of the compositor? That a
node just processes patches/tiles of an image, so that the basic entity
which is processed becomes a tile or even a single pixel?

I hope it's commonly realized that a compositing node always has to
process an entire image globally and output an entire image. The
processing of each pixel can depend on every other pixel in the entire
image, not just on pixels within a tile or on the single corresponding
input pixel. It's really that simple: a node can be expressed by a function
f(image) -> image and not f(tile) -> tile or f(pixel) -> pixel.

Please remember this when doing any design of the new system,
otherwise things will be heavily screwed up.

aurel

On 21 January 2011 08:37, Jeroen Bakker j.bak...@atmind.nl wrote:
 On 01/21/2011 12:10 AM, Matt Ebb wrote:
 I say this not to be negative, but because there is a lot of room for
 functional improvement in blender's compositor, and if it is to be
 re-coded, it should be done with an eye to workflow and future
 abilities, not just from a purely techno-centric perspective.
 I don't see it as negative. Also, I don't think that I am (cap)able to
 implement all these functional wishes/changes. They need to be thought
 through by users and developers together. I also don't think it is good to
 do all the work we discussed in a single project. There will be a
 separation of technology and functionality. The first part should
 concentrate on implementing a kernel (stable ground) that is capable of
 supporting our 'future' wishes/changes/capabilities. Secondly, we need to
 implement the more functional/workflow part.

 The first part needs to know the direction of the second part (vision).
 This vision should be clear upfront, but not in detail.

 Jeroen
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers

___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Aurel W.
On 21 January 2011 15:34, Martin Poirier the...@yahoo.com wrote:
 Not all effects need access to all of the buffer. A lot of them only need
 access to a neighbourhood around each pixel, for which a system of slightly
 overlapping tiles fits the problem.

You are talking about things such as convolution with a defined kernel
size. There are other operations; a compositor truly transforms an
image into another image, not pixels into pixels, etc. If it's
implemented in such a naive way, the compositor will be very limited.
I have a very bad feeling about this.

Ok, let's normalize an image with a tile-based approach... uh, damn it.

aurel
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Brecht Van Lommel
Hi Jeroen,

I'll comment on the tiling / OpenCL proposal itself in another mail later.

I agree with Matt that it would be good to address a number of design
issues first. Perhaps these could be implemented before work on tiling
or OpenCL begins.

* Automatic data type conversion between nodes.
* Storing channels non-interleaved.
* Premul vs. key alpha. We should have a convention here and stick to it.
* Color management. Also think we should decide on a convention here.
* Store transformations along with buffers.
* Change all nodes to use a get_pixel function.

Options to shuffle channels or change color spaces can all be done
outside of the nodes, as part of the automatic data conversion already
proposed. A get_pixel function would handle
procedurals/transformations automatically. These things don't seem
particularly hard to implement, but they would require quite a bit of
code refactoring.
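
For illustration, a hedged sketch in plain C of what such a get_pixel
accessor could look like (types and fields are hypothetical, not existing
Blender code): the node only ever calls get_pixel, and storage, procedural
inputs and a stored translation stay hidden behind it.

    #include <math.h>
    #include <stddef.h>

    /* Sketch only: one input socket that may be backed by a buffer or by a
     * procedural generator, with a translation stored alongside it. */
    typedef struct {
        const float *rect;         /* interleaved RGBA, NULL for procedurals */
        int width, height;
        void (*procedural)(float out[4], float x, float y);
        float offset_x, offset_y;  /* transform stored with the buffer */
    } InputSocket;

    static void get_pixel(const InputSocket *in, float x, float y, float out[4])
    {
        x -= in->offset_x;         /* apply the stored transform transparently */
        y -= in->offset_y;

        if (in->rect == NULL) {    /* procedural input: evaluate on demand */
            in->procedural(out, x, y);
            return;
        }

        int xi = (int)floorf(x), yi = (int)floorf(y);
        if (xi < 0 || yi < 0 || xi >= in->width || yi >= in->height) {
            out[0] = out[1] = out[2] = out[3] = 0.0f;  /* outside: transparent */
            return;
        }
        const float *p = in->rect + 4 * ((size_t)yi * (size_t)in->width + (size_t)xi);
        out[0] = p[0]; out[1] = p[1]; out[2] = p[2]; out[3] = p[3];
    }
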

Further, most of the things in the VFX proposal seem like they would
not have much effect on the internal workings; they are more about UI and
different ways to get data in and out of the compositor.

Brecht.
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Brecht Van Lommel
Hi Jeroen,

Node compiler. I really dislike code going through a node compiler.
OpenCL is basically C with some restrictions and a few extra
qualifiers; let's take advantage of that and, for the CPU, just compile
the kernels as part of the regular compilation process. It's not clear to
me why this node compiler is necessary. What I would propose is to just
#include the kernel into the CMP_* files and call it directly from there.
There are a few things to do to make that work, but it still seems simpler
than having a makenodes step in between.
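
A hedged sketch of that idea (file and macro names are hypothetical): the
per-pixel body lives in one header that compiles both as OpenCL C and, with
the qualifiers defined away, as plain C called from the CPU node.

    /* --- invert_kernel.h: shared between the OpenCL and the CPU build --- */
    #ifndef __OPENCL_VERSION__   /* not defined when compiled as plain C */
    #  define __global
    #endif

    void invert_pixel(__global const float *in, __global float *out, int i)
    {
        out[i * 4 + 0] = 1.0f - in[i * 4 + 0];
        out[i * 4 + 1] = 1.0f - in[i * 4 + 1];
        out[i * 4 + 2] = 1.0f - in[i * 4 + 2];
        out[i * 4 + 3] = in[i * 4 + 3];           /* keep alpha */
    }

    /* --- CMP_invert.c (CPU path): #include the header and loop over pixels;
     *     the OpenCL build would wrap invert_pixel in a thin __kernel. --- */
    static void invert_cpu(const float *in, float *out, int num_pixels)
    {
        for (int i = 0; i < num_pixels; i++)
            invert_pixel(in, out, i);
    }
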

Automatic data conversion between nodes. What I'm not sure about is
the different color data types (RGBA, HSVA, YUVA, ..). These would not
be exposed to the user; all they would know is that it's a Color,
right? Is it really necessary to have these as core data types? Can't
the nodes do such conversions themselves if they want to?

Kernel types. To me it seems perhaps better not to classify kernel
types this way, but to classify buffer inputs as either random access
or not. I'm not sure how you planned to do kernel grouping, but
thinking of it this way also makes it possible to group, for example, two
blurs that are then mixed together.

Memory buffer states. It's not clear to me why these states are stored as
part of the buffer itself; it seems to me some of them are more related
to the node execution than to the memory manager.

Consistency between GPU and CPU. New CUDA GPUs can actually do
identical floating-point ops if you're careful. If you use
optimizations like fast math, SSE or fused multiply-add, this becomes
harder. My guess is that the differences will nearly always be too
small to be visible regardless, since colors don't need that many bits
of precision.

Another problem may be that some types of optimizations run well on
the CPU but are harder on the GPU. Would it still be possible to have
such CPU optimized implementations, or would everything have to be
done in kernels?

Brecht.
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Troy Sobotka
On Fri, Jan 21, 2011 at 8:29 AM, Brecht Van Lommel
brechtvanlom...@pandora.be wrote:
 Automatic data conversion between nodes. What I'm not sure about is
 the different color data types (RGBA, HSVA, YUVA, ..). This would no
 be exposed to the user, all they would know is that it's a Color,
 right? Is it really necessary to have these as core data types, can't
 the nodes do such conversions if they want to?

Sadly, RGB to YCbCr (or vice versa) _would_ need to be exposed, as a
result of the different color matrices: you would need to specify Rec.601
versus Rec.709.

Xat can speak to this further.

I'm certain there are cases that aren't immediately obvious that would
require exposing the color model.
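
To make the dependency concrete, a small sketch in plain C (hypothetical
helper, not Blender code): the same RGB -> YCbCr conversion parameterised
by the Rec.601 and Rec.709 luma weights gives different results, so some
node or conversion layer has to be told which one applies.

    /* Sketch only: RGB -> YCbCr using the standard luma weights. */
    typedef struct { float kr, kb; } YCbCrWeights;

    static const YCbCrWeights REC601 = { 0.299f,  0.114f  };
    static const YCbCrWeights REC709 = { 0.2126f, 0.0722f };

    static void rgb_to_ycbcr(const float rgb[3], float ycbcr[3], YCbCrWeights w)
    {
        const float kg = 1.0f - w.kr - w.kb;
        const float y  = w.kr * rgb[0] + kg * rgb[1] + w.kb * rgb[2];
        ycbcr[0] = y;
        ycbcr[1] = 0.5f * (rgb[2] - y) / (1.0f - w.kb);  /* Cb */
        ycbcr[2] = 0.5f * (rgb[0] - y) / (1.0f - w.kr);  /* Cr */
    }
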
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Ton Roosendaal
Hi Brecht,

Great remarks, and several of these I can assist on too. The get-pixel  
one will be toughest though... I know several nodes have been heavily  
optimized to use rows.

I need to study Jeroen's proposal in detail still, can't say much...  
but based on all the feedback he received, it's definitely good to try  
to limit the scope of work as much as possible, or define a step by  
step migration path.

Reply to some concerns here;

- The compositor should always run well on the CPU (multi-core) too. I'm
convinced it would already benefit a lot from OpenCL's thread balancing. A
bit of performance loss compared to a fully native pthread
implementation (like 10-20%?) is acceptable, provided the GPU gains
are very evident.

- My impression is that non-openCL usage will be mostly on render  
farms, and nearly every average-to-decent 3D workstation will have  
excellent GPU performance. Artist time is still far more valuable than  
computer time :)

- A tile-based subdivision schedule is there for two reasons: efficient
memory use (valid for CPU and GPU alike) and a potentially efficient
threading setup. The latter has to be carefully designed to prevent
bottlenecks on individual nodes that need full buffers (like DOF or
Vector Blur).


-Ton-


Ton Roosendaal  Blender Foundation   t...@blender.org   www.blender.org
Blender Institute   Entrepotdok 57A  1018AD Amsterdam   The Netherlands

On 21 Jan, 2011, at 16:55, Brecht Van Lommel wrote:

 Hi Jeroen,

 I'll comment on the tiling / OpenCL proposal itself in another mail  
 later.

 I agree with Matt that it would be good to address a number of design
 issues first. Perhaps these could be implemented before work on tiling
 or OpenCL begins.

 * Automatic data type conversion between nodes.
 * Storing channels non-interleaved.
 * Premul vs. key alpha. We should have a convention here and stick  
 to it.
 * Color management. Also think we should decide on a convention here.
 * Store transformations along with buffers.
 * Change all nodes to use a get_pixel function.

 Options to shuffle channels, or change color spaces can all be done
 outside of nodes, as part of automatic data conversion as already
 proposed. A get_pixel function would handle
 procedurals/transformations automatically. These things don't seem
 particularly hard to implement, but would be quite a bit of work
 refactoring code.

 Further, most of the things in the VFX proposal seems like they would
 not have much effect on the internal workings, more about UI and
 different ways to get data in/out of the compositor.

 Brecht.
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers

___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Martin Poirier


--- On Fri, 1/21/11, Ton Roosendaal t...@blender.org wrote:

 - A tile-based subdivision schedule is for two reasons; efficient
 memory use (valid for cpu and gpu alike) and for a potential efficient
 threading setup. The latter has to be carefully designed, to prevent
 bottlenecks on individual nodes that need full buffers (like DOF,
 Vector Blur).

DOF and Blur you can take care of with overlapping source tiles, as long as
you know the maximum fetch distance (the blur radius, basically). It takes a
bit more memory, but it means you can parallelize them pretty much however
you want (with diminishing returns, because the overlap zone size is
constant).
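
A small sketch (hypothetical types) of that overlap rule in plain C: to
produce one output tile of a blur with radius r, fetch an input region
grown by r on every side, clamped to the image bounds.

    typedef struct { int xmin, ymin, xmax, ymax; } Rect;  /* max is exclusive */

    static int clampi(int v, int lo, int hi)
    {
        return v < lo ? lo : (v > hi ? hi : v);
    }

    static Rect input_region_for_tile(Rect out_tile, int radius, int img_w, int img_h)
    {
        Rect in;
        in.xmin = clampi(out_tile.xmin - radius, 0, img_w);
        in.ymin = clampi(out_tile.ymin - radius, 0, img_h);
        in.xmax = clampi(out_tile.xmax + radius, 0, img_w);
        in.ymax = clampi(out_tile.ymax + radius, 0, img_h);
        return in;
    }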

Martin


___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Aurel W.
 DOF and Blur you can take care of with overlapping source tiles as long as 
 you know the maximum fetch distance (the blur radius, basically). Takes a bit 
 more memory but it means you can parallelize them pretty much how you want 
 (with diminishing return because the overlap zone size is constant).

Hi, there are many nodes where this won't be easy, and they really
need full-buffer access. Even computing overlapping patches for the
simple convolution case gets far too complicated and is really not
flexible at all.

Let's assume you have a filter node with a lot of iterations, so
several convolutions taking place. The patch-based approach fails
here, since you would also need to access the updated regions in the
other tiles, which were computed by another patch. The only solution
is to grow the overlapping areas depending on the number of
iterations.

Let's assume we have a convolution node such as DOF, which does several
iterations and has a long graph as its input. Essentially the
patch size has to be changed each time you adjust a setting in the
node, and therefore the entire subgraph has to be evaluated again.

Changing patch sizes? That doesn't make sense to me and really gets
overcomplicated. Full buffer access is needed in this case, as I
pointed out previously. There are also other operations which need
access to the entire buffer to determine a single pixel.

Again, I have a very bad feeling about this patch-based approach.

aurel
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Brecht Van Lommel
Hi,

On Fri, Jan 21, 2011 at 10:31 PM, Aurel W. aure...@gmail.com wrote:
 DOF and Blur you can take care of with overlapping source tiles as long as 
 you know the maximum fetch distance (the blur radius, basically). Takes a 
 bit more memory but it means you can parallelize them pretty much how you 
 want (with diminishing return because the overlap zone size is constant).

 Hi, there are many nodes, where this won't be easy, and they really
 need full buffer access. Just computing overlapping patches for a
 simple convolution case gets far too complicated and is really not
 flexible at all.

 Let's assume you have filter node, with a lot of iterations, so
 several convolutions taking place. The patch based approach fails
 here, since you would need to access also the updated regions in the
 other tiles, which were computed by an other patch. The only solution
 is to grow the the overlapping areas depending on the number of
 iterations.

Another solution is to execute kernels multiple times, and load/unload
tiles each time. For each iteration you only need the same region, not
a larger region.

 Let's assume we have a convolution node as DOF, which does several
 iterations and has a long graph as an input node. Essentially the
 patch size has to be changed each time the you adjust a setting in the
 node and therefore the entire sub graph has to be evaluated again.

 Changing patch sizes? That doesn't make sense to me and really gets
 overcomplicated. Full buffer access is needed in this case as I
 pointed out previously. There are also other operations, which need
 access to the entire buffer to determine a single pixel.

 Again, I have a very bad feeling about this patch based approach.

I can't think of any current nodes that would not work with such a
tile-based approach, with some implementation tweaks. But I'm not sure
what your point is: do you think there is a different, better way to
handle buffers larger than memory, or do you think it's impossible?

Brecht.
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Aurel W.
 Another solution is to execute kernels multiple times, and load/unload
 tiles each time. For each iteration you only need the same region, not
 a larger region.
Things start to get a little confusing now. I thought that the entire
graph, or one output node, could be evaluated for a single tile. At least
this is how I understand the proposed tile-based system should work.
Am I wrong about this?

So what you are trying to say is that for one filter node, all tiles of
the image have to be computed for every single iteration, with tiles
having an overlapping area of the filter size. In the next iteration the
tiles are loaded anew, now also containing the results from neighboring
tiles from the last iteration? And this has to be implemented somehow
in the filter node? So in the end the whole image is convolved one
iteration at a time, and the next iteration can't start before all tiles
have finished?

This is roughly what I meant by 'the full buffer is needed'.

 I can't think of current nodes that would not work with such a tile
 based approach, with some implementation tweaks. But I'm not sure what
 your point is though, do you think there is a different, better way to
 handle buffers larger than memory, or do you think it's impossible?
I have no doubt that it would work; the question is how efficient it
is, whether there are better solutions, and whether this is really
necessary. So if I get this right, the only reason for tiling is to
handle large buffers? Large as in larger than main memory, or video
RAM in the case of OpenCL? I am not sure this is really necessary, and
to handle such large images other things also have to be adapted to
support this, like the image viewer, EXR loading, the render buffer... So
are there really plans to support rendering/viewing/compositing of, say,
32k images in Blender from now on?

I agree that tiling would be the only way to support processing
images larger than main memory. But I don't think that it will give
better performance, and I also think that it introduces a lot of
unnecessary complexity.

aurel
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-21 Thread Jeroen Bakker
On 01/21/2011 04:14 PM, Aurel W. wrote:
 You are talking about things such as convolution with a defined kernel
 size. There are other operations and a compositor truly transforms an
 image to another image and not pixels to pixels etc. If it's
 implemented in such a naive way, the compositor will be very limited.
 I got a very bad feeling about this

 Ok, let's normalize an image with a tile based approach,... uh damn it
Aurel, don't worry about that. Tile-based means that the output is part of
a tile, but the input data can be the whole image or a part of it. On the
technical side there will be some issues to overcome (mostly device-memory
related). Btw, when you need every image pixel as input there are ways to
use an intermediate result to reduce the memory need; I did this already
in the defocus node.

Please help me to determine the cases where a whole output image is
needed. IMO input is read-only and output is write-only. I don't see the
need atm to support whole output images in a 'per output pixel'
approach, and every 'per input pixel' approach can be rewritten as a 'per
output pixel' approach. In the current nodes the two approaches are mixed.
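
A hedged sketch in plain C of the model described here (types and names are
hypothetical, not the proposal's actual API): the job owns exactly one
output tile, which is write-only, but may read any pixel of the read-only
input, shown with a 3x3 box filter written per output pixel.

    #include <stddef.h>

    typedef struct {
        const float *input;    /* full single-channel input image, read-only */
        int img_w, img_h;
        float *output;         /* this tile's pixels, write-only             */
        int tile_x, tile_y;    /* tile origin in image coordinates           */
        int tile_w, tile_h;
    } TileJob;

    /* Example: 3x3 box filter expressed per output pixel. */
    static void run_tile(const TileJob *job)
    {
        for (int ty = 0; ty < job->tile_h; ty++)
        for (int tx = 0; tx < job->tile_w; tx++) {
            float sum = 0.0f;
            int count = 0;
            for (int dy = -1; dy <= 1; dy++)
            for (int dx = -1; dx <= 1; dx++) {
                int x = job->tile_x + tx + dx;
                int y = job->tile_y + ty + dy;
                if (x >= 0 && y >= 0 && x < job->img_w && y < job->img_h) {
                    sum += job->input[(size_t)y * job->img_w + x];
                    count++;
                }
            }
            job->output[(size_t)ty * job->tile_w + tx] = sum / (float)count;
        }
    }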

Jeroen

___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-20 Thread Lukas Tönne
There are a couple of things I'd like to note, especially ones not
directly related to OpenCL vs. CPU code (most arguments have been
voiced already):

* On the question of whether a horizontal layout (Blender nodes,
Softimage), a vertical layout (Houdini, Aviary) or a completely customized
layout (Nuke) is preferable: I'd like to point out that it would
probably be difficult to use socket names and default input values for
sockets with anything other than horizontal nodes. Most software packages
that use a different layout approach seem to have just one single type of
socket data, depending on the type of tree. For compositing systems
this is simply the image buffer you want to manipulate; for more
complex systems (such as Houdini) a socket connection can mean a
parent-child object relation, or vertex or particle data, etc.

* While the restriction to one single data type in a tree allows very
clean layout and easily understandable data flow in trees, it also
means that there needs to be a different way of controlling node
parameters, which usually means scripted expressions. Currently many
nodes in Blender have sockets that simply allow you to use variable
parameters, calculated from input data with math nodes or other nodes'
results. Afaik the equivalent to expressions in Blender would be the
driver system, but making this into a feature that is generic enough
to replace node-based inputs is probably a lot more work than just a
compositor recode (correct me if I'm wrong).

* Having a general system for referencing scene data could be
extremely useful, especially for the types of trees in the domain I am
working in: particle sims (and mesh modifiers lately). In compositor
nodes the only real data that must occasionally be referenced is the
camera (maybe later on curves can be useful for masking? just a rough
idea). For simulation nodes, having access to objects, textures, lamps,
etc. is even more crucial.

We already discussed that such references/pointers would have to be
constants, which means that their concrete value is already defined
during tree construction and not only at execution time. This makes it
possible to read the data at the beginning of execution and convert it
to an OpenCL-readable format. It will also allow keeping track of data
dependencies (not much of an issue in the compositor, but again very
important for simulations). Note that there are already some places
where data is linked in a tree (e.g. material and texture nodes), but
these are not implemented as sockets and so don't allow efficient
reuse of their input values by linking.

* I would love to see the memory manager you are planning for tiled
compositing be abstracted just a little more, so that it can be used
for data other than image buffers too. In simulations of millions of
particles the buffers could easily reach sizes comparable to those in
compositing, so it would be a good idea to split them into parts and
process these individually where possible.

In images the pixels all have fixed locations and you can easily
define neighboring tiles to do convolutions. This kind of calculation
is usually not present in arbitrary or unconnected data, such as
particles or mesh vertices, so an element/tile/part will either depend
on just one of the input parts or all of them. But still having a
generic manager for loading parts into memory could avoid some double
work.

Cheers,
Lukas
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-20 Thread Jeroen Bakker
Hi Lukas,

Spaghetti vs Expressions :) : I agree with your conclusion. I really see
this as there being too few control parameters on a node and limited
node implementations. Also, currently we have different granularities of
nodes: very functional nodes, very mathematical nodes and
data (combine, split) nodes. I would make these mathematical nodes
part of the functional node. The data nodes are perhaps not needed
anymore when you have a single data type and color modes in the node
itself. Currently the defocus node is 2D, but it is only useful in 3D.
Therefore compositors will create complex setups with z-clips and
render layers to first split the image into layers, defocus every layer on
its own, and combine these layers again.

You see the same effect with vector blur and two objects moving in
opposite directions. As the depth is not used during the calculation, you
need to split, calculate and combine.

Some generic way to reference scene data: yes, I will redo that part of
the proposal, but at the moment I don't have a solution for every
case. In the compositor the need for the data should be part of the
kernel that will use the data, but as the compositor only has limited
scene references, I don't know the ideal solution for this yet.
Currently it will support camera data, render data (current frame) and
compositor settings (default color mode?).

More abstract memory manager: I agree! I wouldn't implement this tied
only to the compositor situation. I was thinking about something like:
  - alloc(deviceId, len(Struct), width, height) for 2D images
  - alloc(deviceId, len(Struct), size) for 1D arrays
The compositor also uses the array allocation for un/n-ary kernel
groups.
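
For illustration, a hedged sketch in plain C of those two signatures
(illustration only; the real manager would also track device memory, tiles
and buffer states):

    #include <stdlib.h>

    typedef struct {
        int device_id;
        size_t elem_size;
        size_t count;        /* width * height for 2D, plain length for 1D */
        void *host_ptr;      /* stand-in for the actual device allocation  */
    } MemBuffer;

    /* 2D allocation: width * height elements of elem_size bytes (image tiles). */
    static MemBuffer *mem_alloc_2d(int device_id, size_t elem_size, int width, int height)
    {
        MemBuffer *b = malloc(sizeof(*b));
        if (!b) return NULL;
        b->device_id = device_id;
        b->elem_size = elem_size;
        b->count = (size_t)width * (size_t)height;
        b->host_ptr = calloc(b->count, elem_size);
        return b;
    }

    /* 1D allocation: count elements (arrays for un/n-ary kernel groups). */
    static MemBuffer *mem_alloc_1d(int device_id, size_t elem_size, size_t count)
    {
        MemBuffer *b = malloc(sizeof(*b));
        if (!b) return NULL;
        b->device_id = device_id;
        b->elem_size = elem_size;
        b->count = count;
        b->host_ptr = calloc(count, elem_size);
        return b;
    }

    static void mem_free(MemBuffer *b)
    {
        if (b) { free(b->host_ptr); free(b); }
    }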

Jeroen.

On 01/20/2011 09:48 AM, Lukas Tönne wrote:
 There are a couple of things i'd like to note, especially those not
 directly related to OpenCL vs. CPU code (most arguments have been
 voiced already):

 * On the question whether horizontal layout (Blender nodes,
 Softimage), vertical layout (Houdini, Aviary) or completely customized
 layout (Nuke) is preferable: I'd like to point out that it would
 probably be difficult to use socket names and default input values for
 sockets with anything other than horizontal nodes. Most softwares that
 use a different layout approach seem to have just one single type of
 socket data, depending on the type of tree. For compositing systems
 this is simply the image buffer you want to manipulate, for more
 complex systems (such as Houdini) a socket connection can mean a
 parent-child object relation or vertex or particle data, etc.,
 depending on the type of tree.

 * While the restriction to one single data type in a tree allows very
 clean layout and easily understandable data flow in trees, it also
 means that there needs to be a different way of controlling node
 parameters, which usually means scripted expressions. Currently many
 nodes in Blender have sockets that simply allow you to use variable
 parameters, calculated from input data with math nodes or other node's
 results. Afaik the equivalent to expressions in Blender would be the
 driver system, but making this into a feature that is generic enough
 to replace node-based inputs is probably a lot more work than only a
 compositor recode (correct me if i'm wrong).

 * Having a general system for referencing scene data could be
 extremely useful, especially for the types of trees in the domain i am
 working in: particle sims (and mesh modifiers lately). In compositor
 nodes the only real data that must occasionally be referenced is the
 camera (maybe later on curves can be useful for masking? just a rough
 idea). For simulation nodes having access to objects, textures, lamps,
 etc. is much more crucial even.

 We discussed already that such references/pointers would have to be
 constants, which means that their concrete value is already defined
 during tree construction and not only when executing. This makes it
 possible to read the data at the beginning of execution and convert it
 to OpenCL readable format. Also it will allow to keep track of data
 dependencies (not much of an issue in compositor, but again very
 important for simulations). Note that there are already some places
 where data is linked in a tree (e.g. material and texture nodes), but
 these are not implemented as sockets and so don't allow efficient
 reuse of their input values by linking.

 * I would love to see the memory manager you are planning for tiled
 compositing be abstracted just a little more, so that it can be used
 for data other than image buffers too. In simulations of millions of
 particles the buffers could easily reach sizes comparable to those in
 compositing, so it would be a good idea to split them into parts and
 process these individually where possible.

 In images the pixels all have fixed locations and you can easily
 define neighboring tiles to do convolutions. This kind of calculation
 is usually not present in 

Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-20 Thread François T .
Btw, you confirm that Expression PY & the Python pixelizer are two different
things, right?
By Expression PY, I meant being able to type some Python for each
parameter. (I don't have any UI design for that, but it could be something
like: right-click on a param, click on Set Expression, a text field pops up
to type the expression, click OK, and the param turns red to show it
is controlled by an expression.)

A Python pixelizer, on the other hand, as I understand it, is more like a
pixel processing node (like the Expression node in Nuke) which could be
coded in Py (and which could call OCL, GLSL, C, ... functions)? And I
understand this is not a high-performance approach, but from a production
& artist side... not everyone has the time or capability to create a new
node/filter in C and recompile all of Blender, while sometimes a simple
equation could just be typed in to get the result.

I think that for a list of missing nodes, or nodes to get rid of,
Sebastian & Pablo should join the talk :)

Thx

François,

2011/1/19 Jeroen Bakker j.bak...@atmind.nl

 Hi Francois!

 well... my answer is still in a very early draft :). Today I took the
 time to dive into your posting in detail. I missed some parts when I
 read it first.

 Expression PY and a Python pixelizer are doable; I just need to include
 them in the proposal. I really see the value of having this. In my
 perception the expressions apply to all settings of a node (in Nuke I
 thought it was a separate node; perhaps we can borrow this idea from
 them). A Python-based pixelizer can also be done, but it can have some
 limitations in performance. That is for the artist to decide.

 On OpenFX I still haven't looked into the details. Currently I think
 that the bottleneck for implementing this is more on the Blender UI side:
 as OpenFX has plugin capabilities, the UI should be capable of handling
 flexible node settings etc. based on your installed plugins. Also, the
 reading and writing of blend files is not that flexible (yet). I will
 spend some time on this subject next month. Another issue is that it is
 Windows-only; they state that a port to Linux is being planned.

 What I also personally miss is being able to really tweak the internals
 of a node. In October, with Ton's knowledge, I reverse engineered the
 defocus node and came to the conclusion that it was implemented
 differently than we initially thought. Also, looking at how many 'feature
 requests' for this node are placed in the bug tracker, and at the
 complexity of the node, there should be more options on the node to
 fine-tune its usage. This way nodes can become more generally usable. The
 main settings could be altered on the node, but the detailed settings
 could be altered in a panel (the N-key in the compositor); there is more
 room there and it will be more dynamic as it is Python-based.

 A different thing I want to change in the proposal is the current
 connector types. Currently they are buffers of 1, 2, 3 or 4 float
 values, representing Value, Vector and Color. The node system is
 flexible, but simple tasks can become a spaghetti of lines. I think we
 should put all data of a single pixel into a single type. Connecting a
 render layer to the vector-blur node, for example, would then be one
 line containing all the needed data. Tweaking of the vectors can be done
 by settings, or by an expression node. This way we can reduce the number
 of links and make the node system easier to use. Perhaps we will
 introduce some limitations, but they can be tweaked. The node system
 will be cleaner (in functionality and usage). The color model should
 also be included in this data type.

 My question back is: in this kind of setup, what nodes do we expect,
 and what nodes will not be used anymore?

 I really like this discussion. It will take the proposal to the next
 level!

 Jeroen.

 On 01/19/2011 12:30 AM, François T. wrote:
  thx for answering to my blog post via your proposal, to answer some of
 your
  questions there :
 
  *expression py* -  only because it is User/Artist oriented. While python
 is
  great for doing this kind of stuff and pretty popular to most people, I'm
  not so sure about openCL language. by the way this is not a way to make
 new
  node, it is just a node which can control some parameter or datas in your
  comp.
  Look at what is done with Expression in AE :
  http://www.videocopilot.net/basic/tutorials/09.Expressions/ I don't
 think it
  does need OCL power to do this kind of thing. Probably more for the Nuke
  kind of Expression node because it can be do some pixel processing, but
 then
  it is just a wrapper ?
  Maybe on a programmer stand point of view it needs to be openCL or
  whatever... maybe not for the front end user. IMO this needs to be
  consistent with the rest of the scripting language in Blender. Again
  production tool :)
 
 
  *custom passes* are not mask, they are just render passes (normal, P
 pass,
  vector pass... ), but more on a 3d render side rather than the
 compositor.
 
  

Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-20 Thread Matt Ebb
On Fri, Jan 21, 2011 at 4:49 AM, Jeroen Bakker j.bak...@atmind.nl wrote:
 Hi Lukas,

 Spaghetti vs Expressions :) : I agree with your conclusion. I really see
 this as that there are to few control parameters on a node and limited
 node implementations

It doesn't have to be a matter of spaghetti vs expressions. While
Houdini uses a lot of expressions to manage multiple types of data
flowing through one wire, it's not the only way. Many modern node-based
compositors handle several image planes per wire: in Fusion
there is a generic 'channel booleans' node to switch them around, in
Nuke there are re-ordering nodes, and even Shake had a simple text field
where you could specify which RGBA channels, and in what order, would be
processed and output from the input (not an expression). It would be
very easy, with the right internal design, to have consistent options
for each node to choose what input channels it will work on and what
it will output. I mention this in some of my replies to François'
blog.

It seems to me that this discussion is veering towards issues that
impact workflow design, not just speed optimisation. I personally
think this is a good thing - there are several things that can and
should be modernised inside the compositor that would require
re-coding, and the question should always be 'how can this enable
users to produce work faster', rather than 'how do we integrate
library/technology X'. This also comes with a different set of
requirements though: if you're talking about changes to workflow, it
requires more research into this aspect of user interaction, e.g.
understanding how other similar applications work and what can be
learned from them, looking at how professional compositors do things on
a daily basis, etc., not just coming up with ideas in isolation.

I say this not to be negative, but because there is a lot of room for
functional improvement in blender's compositor, and if it is to be
re-coded, it should be done with an eye to workflow and future
abilities, not just from a purely techno-centric perspective.

cheers

Matt
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-20 Thread Vilem Novak
I'd like to ask two more questions:
Where did the idea go of integrating GEGL as the library
driving the compositor processing (originally one of the Durian targets)?
Btw, GEGL just released a new version, 0.1.4.

Will it be harder to develop nodes for the tile-based system than it is
now? Will it still be possible to write
non-tile-based nodes, or non-OpenCL nodes?

Thanks Vilem
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-18 Thread François T .
Thanks for answering my blog post via your proposal. To answer some of
your questions there:

*expression py* - only because it is user/artist oriented. While Python is
great for doing this kind of stuff and pretty popular with most people, I'm
not so sure about the OpenCL language. By the way, this is not a way to make
new nodes; it is just a node which can control some parameters or data in
your comp.
Look at what is done with Expressions in AE:
http://www.videocopilot.net/basic/tutorials/09.Expressions/ I don't think it
needs OCL power to do this kind of thing. Probably more so for the Nuke
kind of Expression node, because that can do some pixel processing, but then
it is just a wrapper?
Maybe from a programmer's standpoint it needs to be OpenCL or
whatever... maybe not for the front-end user. IMO this needs to be
consistent with the rest of the scripting languages in Blender. Again,
production tool :)


*custom passes* are not masks, they are just render passes (normal, P pass,
vector pass...), but more on the 3D render side rather than the compositor.

*masks* if you refer to the RotoBezier add-on, then yes, it is still to be
done IMO. This should be a native tool with all the features that come
with it, probably a new node anyway. As I said, RotoBezier is a great
workaround in the meantime, but not a production tool at all.

*openFX* please pretty please :D


F




2011/1/16 Erwin Coumans erwin.coum...@gmail.com

 Bullet uses its own MiniCL fallback; it requires no external references.
 The main issue is that it is not a full OpenCL implementation (no barriers
 yet, etc.). We developed MiniCL primarily for debugging and secondarily to
 run the Bullet OpenCL kernels on platforms that lack an OpenCL implementation.

 The Intel and AMD OpenCL drivers for the CPU perform similarly to regular
 multi-threaded code (pthreads, OpenMP, etc.), but OpenCL is more suitable for
 data-parallel problems and not for complex code with many branches.

 So while you can port a compositor or cloth simulation to OpenCL, most
 general purpose code requires large refactoring and simplification causing
 reduced quality, so don't expect miracles.

 Still, it will be fun to see compositing, physics simulation etc in Blender
 being accelerated through OpenCL, optionally.

 Thanks,
 Erwin

 On Jan 16, 2011, at 5:34 AM, Jeroen Bakker j.bak...@atmind.nl wrote:

  On 01/15/2011 03:55 PM, (Ry)akiotakis (An)tonis wrote:
  On 15 January 2011 09:19, Matt Ebbm...@mke3.net  wrote:
  While I can believe that there will be dedicated online farms set up
  for this sort of thing I was more referring to farms in animation
  studios, most of which are not designed around GPU power - now, and
  nor probably for a while in the future. Even imagining if in the
  future blender uses openCL heavily, if a studio has not designed a
  farm specifically for blender (which is quite rare), CPU performance
  will continue to be very important. I'm curious how openCL translates
  to CPU multiprocessing performance, especially in comparison with
  using something like blender's existing pthread wrapper.
 
  cheers,
 
  Matt
  ___
  Bf-committers mailing list
  Bf-committers@blender.org
  http://lists.blender.org/mailman/listinfo/bf-committers
 
  I have to disagree on that. Almost every 'serious' user today has an
  OpenCL capable GPU and they can benefit from an OpenCL implementation.
  Besides OpenCL allows for utilization of both CPU and GPU at the same
  time. It's not as if it sets a restriction on CPUs.
  In my understanding the issue is that internal renderfarms have no
  'OpenCL'-capable GPUs (yet). It is not an issue on the user side. Like
  during Durian, we have workstations with medium GPUs and only a CPU-based
  renderfarm. The question is: how would a CPU-based renderfarm benefit
  from OpenCL?

  Users, on the other hand, have different issues. Our user population also
  has non-OpenCL-capable hardware/OSes; therefore we still need a full
  CPU-based fallback, or the Bullet solution of implementing our own OpenCL
  driver. The Bullet solution is complicated in our situation as it needs
  a lot of external references (compilers, linkers, loaders, etc.)
 
  Jeroen
  ___
  Bf-committers mailing list
  Bf-committers@blender.org
  http://lists.blender.org/mailman/listinfo/bf-committers
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers




-- 

François Tarlier
www.francois-tarlier.com
www.linkedin.com/in/francoistarlier
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-18 Thread Ian Johnson
I would just like to chime in on this proposal with my personal experience
developing in OpenCL for use in the Blender Game Engine.
As has been pointed out, not everything can be sped up with OpenCL, and
because it supports multiple device architectures, code optimized for the
GPU won't run fast on the CPU.
Then there is the question of users having the hardware to even run
it, necessitating a CPU-only fallback. With all these factors one might
ask: is it worth it?
I personally think it is very well worth it, especially if it is viewed as
an optional accelerator rather than wholesale algorithm replacement. The
speed benefits for the highly parallelizable problems already mentioned such
as compositing/filters as well as physics such as particle systems (plug:
http://enja.org/2010/12/16/particles-in-bge-fluids-in-real-time-with-opencl/ )
are very convincing. There is a lot of research going into GPU computing for
CG applications, and NVIDIA is pushing CUDA hard. While Blender won't adopt
a proprietary solution such as CUDA, many of the algorithms and techniques
developed for it can be translated to OpenCL.

I'm excited about this proposal not because I want faster compositing, but
because it sets up a framework for dealing with OpenCL in a sane way inside
Blender. I'm currently developing my library standalone and linking it to
Blender, using my own OpenCL wrappers around the Khronos ones. As I learn
more about the Blender codebase, as well as look at Bullet, I am dismayed by
my own code's fragility. Sure, it runs fast on the machines I've tested, but I
do not trust it to be in a consumer-facing application for a while. As a
student and a researcher I'm compelled to spend most of my time developing
the algorithm, and as much as I'd like to integrate my code cleanly, it will
be a while before that can happen. This proposal would give me as a
developer a better platform for contributing directly to Blender, as well as
a central location for me to put effort into standardizing an OpenCL
interface based on my experience with it. Furthermore, as other developers
start to accelerate their code, we will need a solid way of managing device
resources and avoiding redundant or competing memory transfers.
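
For example (a hypothetical sketch with invented names, not an actual Blender
or proposal API): one simple way to avoid uploading the same input buffer
twice is to cache the device-side cl_mem per host image, and hand the cached
handle to every node that reads that image.

/* device_buffer_cache.c - hypothetical sketch of per-image device buffer
 * caching, so two nodes reading the same input do not trigger two
 * host-to-device transfers. Not an actual Blender or proposal API. */
#include <stddef.h>
#include <CL/cl.h>

#define MAX_CACHED 64

typedef struct DeviceBufferCache {
    const void *host_ptr[MAX_CACHED];  /* identity of the host-side image */
    cl_mem devmem[MAX_CACHED];         /* its resident device copy */
    int count;
} DeviceBufferCache;

/* Return a device copy of `pixels`, uploading it only the first time. */
cl_mem cache_get_device_buffer(DeviceBufferCache *cache, cl_context ctx,
                               const float *pixels, size_t bytes,
                               cl_int *err)
{
    for (int i = 0; i < cache->count; i++)
        if (cache->host_ptr[i] == pixels)
            return cache->devmem[i];   /* already resident: no new transfer */

    /* first use of this image: one host-to-device copy, then remember it */
    cl_mem mem = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                bytes, (void *)pixels, err);
    if (*err == CL_SUCCESS && cache->count < MAX_CACHED) {
        cache->host_ptr[cache->count] = pixels;
        cache->devmem[cache->count] = mem;
        cache->count++;
    }
    return mem;
}

Anything smarter than this - eviction, tile granularity, buffers written by
one kernel and read by the next - is exactly the kind of device resource
management a central framework would have to own.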

With the new architectures coming out, the prevalence of capable GPUs, and
the increasingly sophisticated algorithms available, I think OpenCL is going
to be essential. I'd like to throw what little weight I have behind this
proposal, along with my 2 cents :)
Ian


Hi all,

The last few months I have worked hard on the proposal for the OpenCL
based compositor. The proposal is now at the point where it is clear how
the solution should work and what the impact is. As the proposal is on
the technical level, the end user won't feel a difference, except for a
fast tile-based compositor system. In functionality it should be the same.

There are 2 aspects that will be solved:
 * Tile-based compositing
 * OpenCL compositing

To implement these I will introduce additional components:
 * Tile-based memory manager
 * Node (pre-)compiler
 * Configurable automatic data conversion for compositor node systems
 * OpenCL driver manager
 * OpenCL configuration screen
 * Some debug information:
   * OpenCL program, performance, etc.
   * Execution tree (including data types, resolution and kernel grouping)
   * Visualizing the tiles needed for calculation of an area.

And introduce several new data types:
 * Kernels and KernelGroup
 * Camera data type
 * Various color data types

I have put all the documents on a project website for review, as the
proposal is quite long and complex (all decisions are connected with
each other).
Please use bf-committers or #blendercoders to discuss the proposal, or to
ask if something is not clear.

http://ocl.atmind.nl/doku.php?id=design:proposal:compositor-redesign

Cheers,
Jeroen Bakker

-- 
Ian Johnson
http://enja.org
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-16 Thread Jeroen Bakker
On 01/15/2011 03:55 PM, (Ry)akiotakis (An)tonis wrote:
 On 15 January 2011 09:19, Matt Ebbm...@mke3.net  wrote:
 While I can believe that there will be dedicated online farms set up
 for this sort of thing I was more referring to farms in animation
 studios, most of which are not designed around GPU power - now, and
 nor probably for a while in the future. Even imagining if in the
 future blender uses openCL heavily, if a studio has not designed a
 farm specifically for blender (which is quite rare), CPU performance
 will continue to be very important. I'm curious how openCL translates
 to CPU multiprocessing performance, especially in comparison with
 using something like blender's existing pthread wrapper.

 cheers,

 Matt
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers

 I have to disagree on that. Almost every 'serious' user today has an
 OpenCL capable GPU and they can benefit from an OpenCL implementation.
 Besides OpenCL allows for utilization of both CPU and GPU at the same
 time. It's not as if it sets a restriction on CPUs.
In my understanding the issue is that internal renderfarms have no
'OpenCL'-capable GPUs (yet). It is not an issue on the user side. Like
during Durian, we have workstations with medium GPUs and only a CPU-based
renderfarm. The question is: how would a CPU-based renderfarm benefit
from OpenCL?

Users, on the other hand, have different issues. Our user population also
has non-OpenCL-capable hardware/OSes; therefore we still need a full
CPU-based fallback, or the Bullet solution of implementing our own OpenCL
driver. The Bullet solution is complicated in our situation as it needs
a lot of external references (compilers, linkers, loaders, etc.)

Jeroen
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-16 Thread Erwin Coumans
Bullet uses its own MiniCL fallback; it requires no external references. The
main issue is that it is not a full OpenCL implementation (no barriers yet,
etc.). We developed MiniCL primarily for debugging and secondarily to run the
Bullet OpenCL kernels on platforms that lack an OpenCL implementation.

The Intel and AMD OpenCL drivers for CPU perform similarly to regular
multi-threaded code (pthreads, OpenMP, etc.), but OpenCL is more suitable for
data-parallel problems and not for complex code with many branches.

So while you can port a compositor or cloth simulation to OpenCL, most
general-purpose code requires large refactoring and simplification, causing
reduced quality, so don't expect miracles.

Still, it will be fun to see compositing, physics simulation etc in Blender 
being accelerated through OpenCL, optionally.

Thanks,
Erwin

On Jan 16, 2011, at 5:34 AM, Jeroen Bakker j.bak...@atmind.nl wrote:

 On 01/15/2011 03:55 PM, (Ry)akiotakis (An)tonis wrote:
 On 15 January 2011 09:19, Matt Ebbm...@mke3.net  wrote:
 While I can believe that there will be dedicated online farms set up
 for this sort of thing I was more referring to farms in animation
 studios, most of which are not designed around GPU power - now, and
 nor probably for a while in the future. Even imagining if in the
 future blender uses openCL heavily, if a studio has not designed a
 farm specifically for blender (which is quite rare), CPU performance
 will continue to be very important. I'm curious how openCL translates
 to CPU multiprocessing performance, especially in comparison with
 using something like blender's existing pthread wrapper.
 
 cheers,
 
 Matt
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers
 
 I have to disagree on that. Almost every 'serious' user today has an
 OpenCL capable GPU and they can benefit from an OpenCL implementation.
 Besides OpenCL allows for utilization of both CPU and GPU at the same
 time. It's not as if it sets a restriction on CPUs.
 In my understanding the issue is that internal renderfarms have no
 'OpenCL'-capable GPUs (yet). It is not an issue on the user side. Like
 during Durian, we have workstations with medium GPUs and only a CPU-based
 renderfarm. The question is: how would a CPU-based renderfarm benefit
 from OpenCL?

 Users, on the other hand, have different issues. Our user population also
 has non-OpenCL-capable hardware/OSes; therefore we still need a full
 CPU-based fallback, or the Bullet solution of implementing our own OpenCL
 driver. The Bullet solution is complicated in our situation as it needs
 a lot of external references (compilers, linkers, loaders, etc.)
 
 Jeroen
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-15 Thread Jeroen Bakker


On 01/15/2011 08:19 AM, Matt Ebb wrote:
 Thanks, Jeroen.

 On Sat, Jan 15, 2011 at 6:04 PM, Jeroen Bakkerj.bak...@atmind.nl  wrote:
 Farms
 are already being migrated to OpenCL farms. As they are cheaper in
 hardware costs.

 BTW. renderfarm.fi should be capable of running OpenCL as this is
 proposal is implemented!
 While I can believe that there will be dedicated online farms set up
 for this sort of thing I was more referring to farms in animation
 studios, most of which are not designed around GPU power - now, and
 nor probably for a while in the future. Even imagining if in the
 future blender uses openCL heavily, if a studio has not designed a
 farm specifically for blender (which is quite rare), CPU performance
 will continue to be very important. I'm curious how openCL translates
 to CPU multiprocessing performance, especially in comparison with
 using something like blender's existing pthread wrapper.
Thanks for your insight. If you only have OpenCL for CPU (AMD, Intel) it
is hard to predict the results:
1. The OpenCL code is compiled to native code and executed as a shared
library. The code itself should run without speed loss.
2. You have the overhead of the task scheduler in the OpenCL driver -
speed decreases.
3. You can utilize your hardware better - speed increases.
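
As a rough illustration of point 1 - the same kernel source is compiled by the
driver into native code for whatever device you ask for, CPU included - here
is a minimal host-side sketch using the standard Khronos OpenCL C API. The
kernel itself (a brightness multiply) is a hypothetical example, not
compositor code:

/* cpu_device_probe.c - build a trivial kernel for a CPU OpenCL device.
 * Hypothetical sketch; assumes an OpenCL ICD with a CPU driver is installed.
 * Build (assumption): gcc cpu_device_probe.c -lOpenCL -o cpu_device_probe */
#include <stdio.h>
#include <CL/cl.h>

static const char *kernel_src =
    "__kernel void brightness(__global float4 *px, float gain) {\n"
    "    size_t i = get_global_id(0);\n"
    "    px[i] *= gain;\n"
    "}\n";

int main(void)
{
    cl_platform_id platform;
    cl_device_id device;
    cl_int err;

    err = clGetPlatformIDs(1, &platform, NULL);
    if (err != CL_SUCCESS) { fprintf(stderr, "no OpenCL platform\n"); return 1; }

    /* Ask explicitly for a CPU device: OpenCL does not imply GPU. */
    err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &device, NULL);
    if (err != CL_SUCCESS) { fprintf(stderr, "no CPU OpenCL device\n"); return 1; }

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_program prog = clCreateProgramWithSource(ctx, 1, &kernel_src, NULL, &err);

    /* Point 1 above: here the driver compiles the kernel to native CPU code. */
    err = clBuildProgram(prog, 1, &device, NULL, NULL, NULL);
    printf("kernel built for CPU device: %s\n",
           err == CL_SUCCESS ? "ok" : "failed");

    clReleaseProgram(prog);
    clReleaseContext(ctx);
    return 0;
}

The extra cost relative to plain threaded C is that build step plus the
driver's task scheduler (point 2); the potential win is the better hardware
utilization of point 3.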

If you compare non-OpenCL CPU code with OpenCL CPU code, you really can't say
anything about which one is faster, because you're comparing two different
styles and implementations. It depends on the actual implementation that
you will test.

Currently AMD has better support for OpenCL, but in the near future
Intel will be up to speed.
Also, the Bullet physics engine 3.x will use OpenCL, and most studios have
Bullet somewhere... Still, I have not yet tested it.

Jeroen
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-15 Thread Vilem Novak
Maybe an interesting comparison could be SmallLuxGPU, since it has both a
CPU-only and a CPU+OpenCL mode, so you can compare performance on the CPU
both ways while getting similar results, although in raytracing.

  Original message 
 From: Matt Ebb m...@mke3.net
 Subject: Re: [Bf-committers] Proposal: Blender OpenCL compositor
 Date: 15.1.2011 08:19:25
 
 Thanks, Jeroen.
 
 On Sat, Jan 15, 2011 at 6:04 PM, Jeroen Bakker j.bak...@atmind.nl wrote:
 Farms
  are already being migrated to OpenCL farms. As they are cheaper in
  hardware costs.
 
  BTW. renderfarm.fi should be capable of running OpenCL as this is
  proposal is implemented!
 
 While I can believe that there will be dedicated online farms set up
 for this sort of thing I was more referring to farms in animation
 studios, most of which are not designed around GPU power - now, and
 nor probably for a while in the future. Even imagining if in the
 future blender uses openCL heavily, if a studio has not designed a
 farm specifically for blender (which is quite rare), CPU performance
 will continue to be very important. I'm curious how openCL translates
 to CPU multiprocessing performance, especially in comparison with
 using something like blender's existing pthread wrapper.
 
 cheers,
 
 Matt
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers
 
 
 
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-14 Thread Knapp
 While some of the GPU based stuff nowadays looks very spectacular, I
 personally still feel hesitant - I don't think CPUs (and especially
 multiprocessing) should be left by the wayside. Not only due to the
 increasing prevalence of multicore systems nowadays, but also for
 render farms, which are very largely CPU based.

 cheers

 Matt

Yes, but for how long will that remain true??
http://www.tomshardware.com/news/nvda-china-super-computer-gpu,11545.html

Douglas E Knapp

Creative Commons Film Group, Helping people make open source movies
with open source software!
http://douglas.bespin.org/CommonsFilmGroup/phpBB3/index.php

Massage in Gelsenkirchen-Buer:
http://douglas.bespin.org/tcm/ztab1.htm
Please link to me and trade links with me!

Open Source Sci-Fi mmoRPG Game project.
http://sf-journey-creations.wikispot.org/Front_Page
http://code.google.com/p/perspectiveproject/
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-14 Thread Xavier Thomas
Besides, OpenCL does not necessarily mean GPU. OpenCL can be executed by the
CPU and even be accelerated by multiple CPUs/cores.

2011/1/14 Knapp magick.c...@gmail.com

  While some of the GPU based stuff nowadays looks very spectacular, I
  personally still feel hesitant - I don't think CPUs (and especially
  multiprocessing) should be left by the wayside. Not only due to the
  increasing prevalence of multicore systems nowadays, but also for
  render farms, which are very largely CPU based.
 
  cheers
 
  Matt

 Yes, but for how long will that remain true??
 http://www.tomshardware.com/news/nvda-china-super-computer-gpu,11545.html

 Douglas E Knapp

 Creative Commons Film Group, Helping people make open source movies
 with open source software!
 http://douglas.bespin.org/CommonsFilmGroup/phpBB3/index.php

 Massage in Gelsenkirchen-Buer:
 http://douglas.bespin.org/tcm/ztab1.htm
 Please link to me and trade links with me!

 Open Source Sci-Fi mmoRPG Game project.
 http://sf-journey-creations.wikispot.org/Front_Page
 http://code.google.com/p/perspectiveproject/
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers

___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-14 Thread Roger Wickes From IPhone
And next-gen CPUs are incorporating the architecture, from what I read.

Sent from my iPhone

On Jan 14, 2011, at 8:17 AM, Xavier Thomas xavier.thomas.1...@gmail.com wrote:

 Besides, OpenCL does not necessarily mean GPU. OpenCL can be executed by the
 CPU and even be accelerated by multiple CPUs/cores.
 
 2011/1/14 Knapp magick.c...@gmail.com
 
 While some of the GPU based stuff nowadays looks very spectacular, I
 personally still feel hesitant - I don't think CPUs (and especially
 multiprocessing) should be left by the wayside. Not only due to the
 increasing prevalence of multicore systems nowadays, but also for
 render farms, which are very largely CPU based.
 
 cheers
 
 Matt
 
 Yes, but for how long will that remain true??
 http://www.tomshardware.com/news/nvda-china-super-computer-gpu,11545.html
 
 Douglas E Knapp
 
 Creative Commons Film Group, Helping people make open source movies
 with open source software!
 http://douglas.bespin.org/CommonsFilmGroup/phpBB3/index.php
 
 Massage in Gelsenkirchen-Buer:
 http://douglas.bespin.org/tcm/ztab1.htm
 Please link to me and trade links with me!
 
 Open Source Sci-Fi mmoRPG Game project.
 http://sf-journey-creations.wikispot.org/Front_Page
 http://code.google.com/p/perspectiveproject/
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers
 
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-14 Thread Mike Pan
From a user's perspective, it seems a few nodes (blur, defocus, vector blur)
are responsible for over 90% of the node-compositing time in a real
production; accelerating these will probably have a far larger impact/effort
ratio than overhauling the entire framework.

also, sorry about the LinkedIn spam earlier today.

-mike pan


On Fri, Jan 14, 2011 at 11:27 AM, Roger Wickes From IPhone 
rogerwic...@yahoo.com wrote:

 And next-gen CPUs are incorporating the architecture, from what I read.

 Sent from my iPhone

 On Jan 14, 2011, at 8:17 AM, Xavier Thomas xavier.thomas.1...@gmail.com
 wrote:

  Besides, OpenCL does not necessarily mean GPU. OpenCL can be executed by the
  CPU and even be accelerated by multiple CPUs/cores.
 
  2011/1/14 Knapp magick.c...@gmail.com
 
  While some of the GPU based stuff nowadays looks very spectacular, I
  personally still feel hesitant - I don't think CPUs (and especially
  multiprocessing) should be left by the wayside. Not only due to the
  increasing prevalence of multicore systems nowadays, but also for
  render farms, which are very largely CPU based.
 
  cheers
 
  Matt
 
  Yes, but for how long will that remain true??
 
 http://www.tomshardware.com/news/nvda-china-super-computer-gpu,11545.html
 
  Douglas E Knapp
 
  Creative Commons Film Group, Helping people make open source movies
  with open source software!
  http://douglas.bespin.org/CommonsFilmGroup/phpBB3/index.php
 
  Massage in Gelsenkirchen-Buer:
  http://douglas.bespin.org/tcm/ztab1.htm
  Please link to me and trade links with me!
 
  Open Source Sci-Fi mmoRPG Game project.
  http://sf-journey-creations.wikispot.org/Front_Page
  http://code.google.com/p/perspectiveproject/
  ___
  Bf-committers mailing list
  Bf-committers@blender.org
  http://lists.blender.org/mailman/listinfo/bf-committers
 
  ___
  Bf-committers mailing list
  Bf-committers@blender.org
  http://lists.blender.org/mailman/listinfo/bf-committers
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers

___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-14 Thread Matt Ebb
Thanks, Jeroen.

On Sat, Jan 15, 2011 at 6:04 PM, Jeroen Bakker j.bak...@atmind.nl wrote:
Farms
 are already being migrated to OpenCL farms. As they are cheaper in
 hardware costs.

 BTW. renderfarm.fi should be capable of running OpenCL as this is
 proposal is implemented!

While I can believe that there will be dedicated online farms set up
for this sort of thing I was more referring to farms in animation
studios, most of which are not designed around GPU power - now, and
nor probably for a while in the future. Even imagining if in the
future blender uses openCL heavily, if a studio has not designed a
farm specifically for blender (which is quite rare), CPU performance
will continue to be very important. I'm curious how openCL translates
to CPU multiprocessing performance, especially in comparison with
using something like blender's existing pthread wrapper.

cheers,

Matt
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-13 Thread François T .
Hi,

I do think having OpenCL support is very cool for some nodes, and I also
believe it should be built on a strong foundation. As we discussed at
several points, the Blender compositor structure is getting a bit old (Matt
did mention some of the low-level issues here:
http://www.francois-tarlier.com/blog/blender-vfx-wish-list-features/)
All I'm saying is: should so much effort be put into it right now,
while some lower-level issues are still there to change? Or would that not
change anything?
Yes, OpenCL will accelerate the compositor as it is, but isn't it a false
solution to a bigger problem? If so, no matter what, OpenCL will always make
it faster; I'm just afraid this will be taken as the solution to the
slow side of the compositor (which IMO it is not).
As a user I would prefer better performance rather than just faster (which
doesn't have to be the same thing).

Anyhow, since I don't know much about it, this is just a reflection.

thx

F

2011/1/12 Sean Olson seanol...@gmail.com

 You should probably put who you are and what your qualifications are on the
 donation site as well.   I stumbled on the site through a twitter link
 initially and had no idea who was doing the project.

 -Sean

 On Wed, Jan 12, 2011 at 10:07 AM, Jeroen Bakker j.bak...@atmind.nl
 wrote:

  Hi all,
 
  The last few months I have worked hard on the proposal for the OpenCL
  based compositor. The proposal is now at the point where it is clear how
  the solution should work and what the impact is. As the proposal is on
  the technical level, the end user won't feel a difference, except for a
  fast tile-based compositor system. In functionality it should be the same.

  There are 2 aspects that will be solved:
   * Tile-based compositing
   * OpenCL compositing

  To implement these I will introduce additional components:
   * Tile-based memory manager
   * Node (pre-)compiler
   * Configurable automatic data conversion for compositor node systems
   * OpenCL driver manager
   * OpenCL configuration screen
   * Some debug information:
     * OpenCL program, performance, etc.
     * Execution tree (including data types, resolution and kernel grouping)
     * Visualizing the tiles needed for calculation of an area.

  And introduce several new data types:
   * Kernels and KernelGroup
   * Camera data type
   * Various color data types

  I have put all the documents on a project website for review, as the
  proposal is quite long and complex (all decisions are connected with
  each other).
  Please use bf-committers or #blendercoders to discuss the proposal, or to
  ask if something is not clear.
 
  http://ocl.atmind.nl/doku.php?id=design:proposal:compositor-redesign
 
  Cheers,
  Jeroen Bakker
  ___
  Bf-committers mailing list
  Bf-committers@blender.org
  http://lists.blender.org/mailman/listinfo/bf-committers
 



 --
 ||-- Instant Messengers --
 || ICQ at 11133295
 || AIM at shatterstar98
 ||  MSN Messenger at shatte...@hotmail.com
 ||  Yahoo Y! at the_7th_samuri
 ||--
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers




-- 

François Tarlier
www.francois-tarlier.com
www.linkedin.com/in/francoistarlier
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


[Bf-committers] Proposal: Blender OpenCL compositor

2011-01-12 Thread Jeroen Bakker
Hi all,

The last few months I have worked hard on the proposal for the OpenCL
based compositor. The proposal is now at the point where it is clear how
the solution should work and what the impact is. As the proposal is on
the technical level, the end user won't feel a difference, except for a
fast tile-based compositor system. In functionality it should be the same.

There are 2 aspects that will be solved:
  * Tile-based compositing
  * OpenCL compositing

To implement these I will introduce additional components:
  * Tile-based memory manager
  * Node (pre-)compiler
  * Configurable automatic data conversion for compositor node systems
  * OpenCL driver manager
  * OpenCL configuration screen
  * Some debug information:
    * OpenCL program, performance, etc.
    * Execution tree (including data types, resolution and kernel grouping)
    * Visualizing the tiles needed for calculation of an area.

And introduce several new data types:
  * Kernels and KernelGroup
  * Camera data type
  * Various color data types
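
To make the 'tile based' and 'per output pixel' ideas concrete, below is a
minimal sketch of what such a kernel could look like in OpenCL C. All names
and parameters are invented for illustration and are not code from the
proposal; the point is only that the inputs are read-only buffers that may
cover more than the tile, while each work-item writes exactly one pixel of
the write-only output tile.

/* Hypothetical 'per output pixel' tile kernel (an add/mix node), for
 * illustration only. One work-item per output pixel of the tile. */
__kernel void node_mix_add(__global const float4 *in_a, /* input A, read-only */
                           __global const float4 *in_b, /* input B, read-only */
                           __global float4 *out,        /* output tile, write-only */
                           int buffer_width,            /* width of the full inputs */
                           int tile_x, int tile_y,      /* tile offset in the image */
                           int tile_w, int tile_h,      /* tile size */
                           float fac)                   /* mix factor */
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    if (x >= tile_w || y >= tile_h)
        return;

    /* read from the full-resolution inputs... */
    int src = (tile_y + y) * buffer_width + (tile_x + x);
    float4 a = in_a[src];
    float4 b = in_b[src];

    /* ...and write exactly one pixel of the output tile */
    out[y * tile_w + x] = a + fac * b;
}

A host-side scheduler would then enqueue this once per tile with a 2D global
work size of (tile_w, tile_h).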

I have put all the documents on a project website for review, as the
proposal is quite long and complex (all decisions are connected with
each other).
Please use bf-committers or #blendercoders to discuss the proposal, or to
ask if something is not clear.

http://ocl.atmind.nl/doku.php?id=design:proposal:compositor-redesign

Cheers,
Jeroen Bakker
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers


Re: [Bf-committers] Proposal: Blender OpenCL compositor

2011-01-12 Thread Sean Olson
You should probably put who you are and what your qualifications are on the
donation site as well.   I stumbled on the site through a twitter link
initially and had no idea who was doing the project.

-Sean

On Wed, Jan 12, 2011 at 10:07 AM, Jeroen Bakker j.bak...@atmind.nl wrote:

 Hi all,

 The last few months I have worked hard on the proposal for the OpenCL
 based compositor. The proposal is now at the point where it is clear how
 the solution should work and what the impact is. As the proposal is on
 the technical level, the end user won't feel a difference, except for a
 fast tile-based compositor system. In functionality it should be the same.

 There are 2 aspects that will be solved:
  * Tile-based compositing
  * OpenCL compositing

 To implement these I will introduce additional components:
  * Tile-based memory manager
  * Node (pre-)compiler
  * Configurable automatic data conversion for compositor node systems
  * OpenCL driver manager
  * OpenCL configuration screen
  * Some debug information:
    * OpenCL program, performance, etc.
    * Execution tree (including data types, resolution and kernel grouping)
    * Visualizing the tiles needed for calculation of an area.

 And introduce several new data types:
  * Kernels and KernelGroup
  * Camera data type
  * Various color data types

 I have put all the documents on a project website for review, as the
 proposal is quite long and complex (all decisions are connected with
 each other).
 Please use bf-committers or #blendercoders to discuss the proposal, or to
 ask if something is not clear.

 http://ocl.atmind.nl/doku.php?id=design:proposal:compositor-redesign

 Cheers,
 Jeroen Bakker
 ___
 Bf-committers mailing list
 Bf-committers@blender.org
 http://lists.blender.org/mailman/listinfo/bf-committers




-- 
||-- Instant Messengers --
|| ICQ at 11133295
|| AIM at shatterstar98
||  MSN Messenger at shatte...@hotmail.com
||  Yahoo Y! at the_7th_samuri
||--
___
Bf-committers mailing list
Bf-committers@blender.org
http://lists.blender.org/mailman/listinfo/bf-committers