I'm not a 3D expert, but my "gut" tells me that the two pipelines should remain distinct, as you say. I can't imagine the evolution of such different functions converging in a way where the semantic treatment of the two coincides cleanly, simply, and without confusion. That seems like it would only lead to compromise and prevent either concept from developing to full maturity. And what about the possible OpenGL exposure from the 3D API that you mentioned? Would that still be possible while merging 2D and 3D semantics?
David

On Jul 18, 2013, at 3:58 PM, Richard Bair <richard.b...@oracle.com> wrote:

> While working on RT-5534, we found a large number of odd cases when mixing 2D and 3D. Some of these we talked about previously; some either we hadn't, or at least they hadn't occurred to me. With 8 we are defining a lot of new API for 3D, and we need to make sure that we've very clearly defined how 2D and 3D nodes interact with each other, or developers will run into problems frequently and fire off angry emails about it :-)
>
> Fundamentally, 2D and 3D rendering are completely different. There are differences in how opacity is understood and applied. 2D graphics frequently use clips, whereas 3D does not (other than clipping the view frustum or other such environmental clipping). 2D uses things like filter effects (drop shadow, etc.) that are based on pixel bashing, whereas 3D uses light sources, shaders, or other such techniques to cast shadows, implement fog, dynamic lighting, etc. In short, 2D is fundamentally about drawing pixels and blending using the painter's algorithm, whereas 3D is about geometry and shaders and (usually) a depth buffer. Of course, 2D is almost always defined with 0,0 in the top left, positive x to the right, and positive y down, whereas 3D almost always has 0,0 in the center, positive x to the right, and positive y up. But that's just a transform away, so I don't consider that a *fundamental* difference.
>
> There are many ways in which these differences manifest themselves when mixing content between the two graphics.
>
> http://fxexperience.com/?attachment_id=2853
>
> This picture shows 4 circles and a rectangle. They are set up such that all 5 shapes are in the same group [c1, c2, r, c3, c4]. However, depthBuffer is turned on (as well as the perspective camera) so that I can use Z to position the shapes instead of the painter's algorithm. You will notice that the first two circles (green and magenta) have a "dirty edge", whereas the last two circles (blue and orange) look beautiful. Note that even though there is a depth buffer involved, we're still issuing these shapes to the card in a specific order.
>
> For those not familiar with the depth buffer, the way it works is very simple. When you draw something, in addition to recording the RGBA values for each pixel, you also write to an array (one element per pixel) with a value for every non-transparent pixel that was touched. In this way, if you draw something on top and then draw something beneath it, the graphics card can check the depth buffer to determine whether it should skip a pixel. So in the image, we draw green for the green circle, and later draw the black for the rectangle, and because some pixels were already drawn to by the green circle, the card knows not to overwrite those with the black pixels of the background rectangle.
>
> The depth buffer is just a technique used to ensure that rendered content respects Z for the order in which things appear composited in the final frame. (You can individually cause nodes to ignore this requirement by setting depthTest to false for a specific node or branch of the scene graph, in which case they won't check with the depth buffer prior to drawing their pixels; they'll just overwrite anything that was drawn previously, even if it has a Z value that would put it behind the thing it is drawing over!)
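A minimal sketch of the bookkeeping described above, as a toy software rasterizer in Java. This is illustrative pseudologic, not JavaFX/Prism internals; the buffer layout and the "smaller z = closer" convention are assumptions:

    import java.util.Arrays;

    // Toy rasterizer fragment illustrating a depth-buffered pixel write.
    final class DepthBufferSketch {
        final int width;
        final float[] depth;  // one entry per pixel; nearest z drawn so far
        final int[] color;    // packed ARGB per pixel

        DepthBufferSketch(int width, int height) {
            this.width = width;
            this.depth = new float[width * height];
            this.color = new int[width * height];
            Arrays.fill(depth, Float.POSITIVE_INFINITY); // nothing drawn yet
        }

        void writePixel(int x, int y, float z, int argb) {
            int i = y * width + x;
            if (z <= depth[i]) {  // in front of whatever was drawn before?
                depth[i] = z;     // claim the pixel's depth...
                color[i] = argb;  // ...and overwrite its color, no blending
            }
            // else: skip; a closer pixel already owns this location. This is
            // exactly why the circle's semi-transparent AA fringe, drawn
            // first, blocks the black rectangle drawn later: those fringe
            // pixels still wrote depth values.
        }
    }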
> For the sake of this discussion, "3D world" means "depth buffer enabled" and assumes the perspective camera is enabled, and "2D" means "2.5D capable", by which I mean perspective camera but no depth buffer.
>
> So:
>
> 1) Draw the first green circle. This is done by rendering the circle into an image with nice anti-aliasing, and then rotating that image and blending it with anything already in the frame buffer.
> 2) Draw the magenta circle. Same as with green -- draw into an image with nice AA, rotate, and blend.
> 3) Draw the rectangle. Because the depth buffer is turned on, for each pixel of the green & magenta circles, we *don't* render any black. Because the AA edge has been touched with some transparency, it was written to the depth buffer, and we will not draw any black there. Hence the dirty fringe! No blending!
> 4) Draw the blue circle into an image with nice AA, rotate, and blend. AA edges are blended nicely with the black background!
> 5) Draw the orange circle into an image with nice AA, rotate, and blend. AA edges are blended nicely with the black background!
>
> Transparency in 3D is a problem, and on ES2 it is particularly difficult to solve. As such, it is usually up to the application to sort its scene graph nodes in such a way as to end up with something sensible. The difficulty in this case is that when you use any 2D node and mix it in with 3D nodes (or even other 2D nodes, but with the depth buffer turned on), you end up in a situation where the nice AA becomes a liability rather than an asset -- unless you have manually sorted all your nodes in such a way as to avoid the transparency problems.
>
> There are other problems. Suppose you create a scene where you have 3 Rectangles, with Z values:
>
> r1.setTranslateZ(10);
> r2.setTranslateZ(20);
> r3.setTranslateZ(30);
>
> g1 = [r2, r3]
> g2 = [g1, r1]
>
> If you have the depth buffer turned on, then you would expect that r1 is drawn on top of r2, which is drawn on top of r3, regardless of the presence of groups. Because we're using a depth buffer, the order in which things are rendered is independent of the order in which they appear in the scene graph; the Z values are the only thing that really dictates what ends up on top.
>
> Now, something weird is going to happen if I apply an effect, clip, or blendMode to g1, or turn node caching on for it. All 4 of these properties are 2D properties that by their nature result in "flattening". That is, they take the scene graph they've been given, render it to an intermediate image, and then composite that image into the rest of the scene. In this case, since g1 has no Z translation, what you would get is the combination of r2 and r3 drawn on top of r1! We've flattened r2 and r3 into an image which is then rendered at Z=0, which is above r1 with Z=10.
>
> This behavior, although surprising, is consistent and correct. But it sure is surprising for those who, like me, are traditional 2D developers coming to the 3D world!
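A runnable reconstruction of the r1/r2/r3 example above; the sizes, colors, and x/y offsets are assumptions (the mail gives only the Z values and the grouping):

    import javafx.application.Application;
    import javafx.scene.Group;
    import javafx.scene.PerspectiveCamera;
    import javafx.scene.Scene;
    import javafx.scene.effect.DropShadow;
    import javafx.scene.paint.Color;
    import javafx.scene.shape.Rectangle;
    import javafx.stage.Stage;

    public class FlatteningSurprise extends Application {
        @Override public void start(Stage stage) {
            Rectangle r1 = new Rectangle(150, 150, Color.RED);
            Rectangle r2 = new Rectangle(150, 150, Color.GREEN);
            Rectangle r3 = new Rectangle(150, 150, Color.BLUE);
            r2.setX(50);  r2.setY(50);   // offsets so the overlap is visible
            r3.setX(100); r3.setY(100);
            r1.setTranslateZ(10);        // nearest the camera
            r2.setTranslateZ(20);
            r3.setTranslateZ(30);        // farthest away

            Group g1 = new Group(r2, r3);
            Group g2 = new Group(g1, r1);

            // Depth buffer on: Z alone decides what is on top, so r1 wins.
            Scene scene = new Scene(g2, 500, 400, true /* depthBuffer */);
            scene.setCamera(new PerspectiveCamera());

            // Uncomment either line: g1 flattens into an image at Z=0, and
            // that image is then composited *in front of* r1 sitting at Z=10.
            // g1.setEffect(new DropShadow());
            // g1.setCache(true);

            stage.setScene(scene);
            stage.show();
        }

        public static void main(String[] args) { launch(args); }
    }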
> Then there is the new support for scene anti-aliasing (presently using multi-sampling, referred to as MSAA). In our 2D rendering, we always anti-alias all shapes using a special set of shaders and grayscale masks generated in software. This is a common technique and produces objectively the best AA money can buy, often with the least overhead (the cost is in generating and uploading the masks, which for most things we've optimized the heck out of, though for paths you will still run into the worst-case scenarios). MSAA, on the other hand, applies an algorithm against the entire scene in order to produce "automatic" AA on everything. (There are many ways to do scene anti-aliasing. One way you can think of would be to draw to a buffer 4x or 8x as large as necessary, then scale it down to 1x using bilinear scaling and put that on the screen, letting the image scaling algorithm do the work.)
>
> https://wiki.mozilla.org/images/4/48/Msaa_comparison.png
>
> Here you can see the smoothed edges of the monster. However, MSAA does take extra cycles, and on resource-constrained devices you may not want to do this at all. In addition, it gives you worse AA than you would get with our mask / shader approach for 2D shapes.
>
> Also, opacity. In 2D rendering contexts, using opacity means "render to an image and apply the alpha blend to the entire image". This also inherently means flattening. In 3D contexts, if you put an alpha on a Group, it should mean "multiply this alpha with the alpha of each of my children individually". This would always give the wrong result in 2D, but generally the right one in 3D. And certainly better than flattening a group, which is pretty much always a problem.
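The two opacity semantics, contrasted as a plain-Java sketch. These are hypothetical helper methods for illustration, not renderer API: 2D flattens the group and blends once, while the proposed 3D semantic multiplies the group alpha into each child, which is still drawn individually at its own depth:

    final class OpacityContrast {
        // 2D semantics: the group was rendered to an image first, so one
        // alpha multiply over the finished pixels is all that happens.
        static int[] applyGroupOpacity2D(int[] flattenedArgb, double groupOpacity) {
            int[] out = flattenedArgb.clone();
            for (int i = 0; i < out.length; i++) {
                int alpha = out[i] >>> 24;                       // existing alpha
                int scaled = (int) Math.round(alpha * groupOpacity);
                out[i] = (scaled << 24) | (out[i] & 0x00FFFFFF); // keep RGB
            }
            return out;
        }

        // Proposed 3D semantics: no intermediate image; the group merely
        // contributes a multiplier, and depth ordering still applies per child.
        static double effectiveOpacity3D(double childOpacity, double groupOpacity) {
            return childOpacity * groupOpacity;
        }
    }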
> So in summary: if you use 2D APIs in a 3D world (effect, clip, blendMode, node caching), then you get surprising results. If you use a 2D shape in a 3D world, then the nice AA of 2D shapes may end up good or bad depending on the render order relative to depth. And depending on whether you use a parallel or perspective camera, using 3D shapes in a 2D world may end up quite surprising as well.
>
> So what do I propose to do about this? Well, we can leave it be and just document the heck out of it. Or we can try to tease apart the scene graph into Node, Node3D, and NodeBase. Right now we're doing the former, and I've tried the latter and it makes a mess in many places. We can talk about those alternatives if you like, but to shorten (ahem) this message, I'm going to just say it doesn't work (at least, it doesn't work well and may not work at all) and leave it at that.
>
> Instead, I propose that we keep the integrated scene graph as we have it, but introduce two new classes, Scene3D and SubScene3D. These would be configured specially in two ways. First, they would default to depthTest enabled, scene anti-aliasing enabled, and perspective camera. Meanwhile, Scene and SubScene would be configured for 2.5D by default, such that depthTest is disabled, scene AA is disabled, and perspective camera is set. In this way, if you rotate a 2.5D shape, you get perspective as you would expect, but none of the other 3D behaviors. Scene3D and SubScene3D could also have y-up and 0,0 in the center.
>
> Second, we will interpret the meaning of opacity differently depending on whether you are in a Scene / SubScene or a Scene3D / SubScene3D. Over time we will also implement different semantics for rendering in both worlds. For example, if you put a 2D rectangle in a Scene3D / SubScene3D, we would use a quad to represent the rectangle and would not AA it at all, allowing the Scene3D's anti-aliasing property to define how to handle it. Likewise, a complex path could either be tessellated, or we could still use the mask + shader approach to filling it, but we would do so with no AA (so the mask is black or white, not grayscale).
>
> If you use effects, clips, or blendModes, we're going to flatten in the 3D world as well. But since these are not common things to do in 3D, I find that quite acceptable. Meanwhile, in 3D we'll simply ignore the cache property (since it is just a hint).
>
> So the idea is that we can have different pipelines optimized for 2D or 3D rendering, and we will key off which kind to use based on Scene / Scene3D or SubScene / SubScene3D. Shapes will look different depending on which world they're rendered in, but that follows. All shapes (2D and 3D) will render by the same rules in the 3D realm.
>
> Thoughts?
>
> Richard
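For concreteness: Scene3D / SubScene3D do not exist, they are only the proposal above. A rough sketch of what their defaults would amount to if spelled out by hand against the existing SubScene API (the constructor shape is assumed from the in-progress 8 scene-antialiasing work and may differ in current builds):

    import javafx.scene.Parent;
    import javafx.scene.PerspectiveCamera;
    import javafx.scene.SceneAntialiasing;
    import javafx.scene.SubScene;

    final class Scene3DDefaults {
        static SubScene make3DStyleSubScene(Parent root3D) {
            SubScene s = new SubScene(root3D, 800, 600,
                    true,                         // depth buffer enabled
                    SceneAntialiasing.BALANCED);  // scene AA enabled
            s.setCamera(new PerspectiveCamera(true)); // perspective camera
            return s;
            // A real SubScene3D would additionally put 0,0 in the center
            // with y-up, and reinterpret group opacity as multiply-down.
        }
    }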