Hi Laurent,

There were one or two very minor fixes I suggested in a couple of files that you indicated you were going to make, but I didn't see a new webrev. Is there one? I think the existing array stuff is probably OK, and I'll look at it again now that I understand the tradeoffs of its current design, but I wanted to know if there is a newer webrev for me to review that includes those other minor fixes.

On 6/12/15 5:45 AM, Laurent Bourgès wrote:
        A general thought - should we have foo_initial arrays if we are
        getting everything from a cache?

    I adopted this approach to maximize performance for the 99% case: I
    sized the initial arrays properly and they take ~512 KB per
    RendererContext; the (weak) ArrayCache is only used for extreme
    cases (a bit slower). In March 2013, I started by using the cache
    for all array accesses and it was 30% slower!

Was the 30% back when we were allocating or clearing all of the arrays? Now that we only clear what we use, the initial sizes of the arrays shouldn't matter so much. It would be worth a little extra size overhead to not have so many special cases and special handling for the "_initial" arrays if the performance is now essentially identical.
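
For anyone skimming this thread, the pattern under discussion looks roughly like the following - a minimal sketch with illustrative names (SketchContext, ensureCapacity), not the actual Marlin fields:

    final class SketchContext {
        static final int INITIAL_EDGES = 1024;

        // Pre-sized for the common case and never handed to the cache.
        private final int[] edgePtrs_initial = new int[INITIAL_EDGES];

        // Working reference; starts out pointing at the initial array.
        private int[] edgePtrs = edgePtrs_initial;

        int[] ensureCapacity(int needed) {
            if (needed > edgePtrs.length) {
                // Rare path: grow the array (via the cache in the real
                // code; a plain copy here), leaving the initial array
                // untouched for the next operation.
                edgePtrs = java.util.Arrays.copyOf(edgePtrs, needed * 2);
            }
            return edgePtrs;
        }

        void dispose() {
            // Reset to the initial array so the next init() is cheap.
            edgePtrs = edgePtrs_initial;
        }
    }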

    I will look again at these cases: I agree "leaking refs" should be
    set to null to be sure. However, Marlin code always executes the
    pattern init() / dispose(), so leaking pointers are not critical,
    but OK, let's set all of them to null.

    For initial arrays, it seems redundant to set them to null in
    dispose() and back to the initial array in init() = useless and
    counter-productive!

It's just that code gets easier to review and maintain if there is a clear reset to some initial conditions. I would like to keep the code as simple and straightforward as possible while we evaluate algorithmic improvements, without introducing potential bugs through a lot of special-case code, and then we can worry about the 1%'s later.

Also, we may find that moving the code to native is more of a performance improvement than fighting with HotSpot's load/store policies, in which case any code that tries to produce "slightly more optimal bytecodes" would end up not being needed and would just add noise to such a native conversion.

But, I do agree that technically you are correct about some of these suggestions leading to some provably redundant code. The issue is more that constantly having to reprove the redundancies gets in the way of some other progress we could be making.

And, on the other hand, right now you are the engineer driving this from the workload point of view (and I apologize if I am inadvertently in a role that seems to be described as "the engineer getting in the way"), so I'm willing to rely on your commitment to be diligent about these optimizations. I'm just stating a case for whether they should be the focus yet.

    Sorry, it is not a flaw but an advantage: the initial arrays that
    can grow in widenXXXArray() have an odd size, as indicated by the
    comment:

        // +1 to avoid recycling in Helpers.widenArray()
        private final int[] edgePtrs_initial = new int[INITIAL_SMALL_ARRAY + 1]; // 4K

I saw that after the fact, but my main point remains that this is a fragile inter-module optimization. I was unaware that it could save us 30% to have special-cased "initial" arrays, and that fact makes this relationship that much more important, but I wonder if it is still true. If we ditch the initial arrays because the new "dirty array" management removes that performance penalty, then this all goes away. And even if we keep the initial-array concept, it would be nice to find a more reliable way to manage it than a side effect of its length.

    I wanted to have a simple but robust API that can be easily used in
    Marlin.

I'd like to think that I'm working toward (and promoting) simple and robust APIs as well.

    I agree the documentation can be improved a lot, and I could
    introduce a new helper method in ArrayCache to create the initial
    arrays in a safer and more explicit way:

        int[] createNonCachedIntArray(int length) {
            // return only odd lengths to detect wrong cache usage
            if ((length & 1) == 0) {
                length += 1;
            }
            return new int[length];
        }

In a future effort, if we find that we must keep some small initial arrays for performance, we could make this a more explicit and tunable parameter on the caches. Let's table that until we consider if they buy us any performance after these dirty array changes.

    Anyway, I agree the cache handling can be improved a lot: try using
    generics for the array types with a lambda factory to allocate
    them.

I just wish that generics could handle primitive arrays, but they can't. I did a similar generification thing for the raster format conversion code in FX, but I had to play games with the generic types being ByteBuffer and IntBuffer and having the variants of those accept the proper type of primitive arrays. It wasn't pretty, but it did work out in the end.
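
For what it's worth, the allocation half of that suggestion does generify, since int[] and friends are reference types; something along these lines compiles (an illustrative toy, not a proposed API):

    import java.util.function.IntFunction;

    final class ArrayFactoryDemo {
        // The array type itself is the type parameter; a lambda or
        // constructor reference supplies the per-type allocation.
        static <A> A allocate(IntFunction<A> factory, int length) {
            return factory.apply(length);
        }

        public static void main(String[] args) {
            int[] ints     = allocate(int[]::new, 64);
            byte[] bytes   = allocate(byte[]::new, 64);
            float[] floats = allocate(float[]::new, 64);
            System.out.println(ints.length + " " + bytes.length + " "
                               + floats.length);
        }
    }

It's everything that touches the elements - clearing, copying, widening - that still needs per-type code, which is exactly where the Buffer games came in.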

        The caching code is an example of "code with no obvious flaws
        because it is hard to figure out if it has flaws".  We should
        strive for "code with obviously no flaws because it is easy to
        verify that it is doing everything correctly".


    I disagree with the term "flaw", but I agree it is hard to figure
    out what is being done.

I took that from a talk on producing more secure code. I think it originally came from an overall system design that could easily be proven secure. The first release of Java required all sensitive Java code to be constantly asking "am I in a context where this operation is allowed?", and the detection was fragile. It was replaced by code where all sensitive operations were rejected unless the surrounding code proved its protected context, which was much more secure. "Disallowed unless requested" was harder to get wrong than "allowed unless it looks questionable".

But that same philosophy serves a lot of code well. If you start having a bunch of "agreements" between modules, they can quickly get out of sync through regular maintenance and sometimes spiral out of control, and you end up spending more time verifying that you're still satisfying all of those conventions rather than letting the code make sure things are kosher because only valid operations are expressible.

    OK, I can work later on a new framework:
    - ArrayCache<K> abstract class, where <K> is byte[], int[], or float[]
    - CleanArrayCache<K> extends ArrayCache<K> = clean arrays
    - DirtyArrayCache<K> extends ArrayCache<K> = dirty arrays
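
A rough sketch of how that split could look - hypothetical names and shapes, not a final design - where the subclasses differ only in whether a recycled array is zero-filled before reuse:

    import java.util.ArrayDeque;
    import java.util.Arrays;
    import java.util.function.IntFunction;

    abstract class ArrayCache<K> {
        private final IntFunction<K> factory;
        // A real cache would bucket arrays by length; a single pool
        // keeps the sketch short.
        private final ArrayDeque<K> pool = new ArrayDeque<>();

        ArrayCache(IntFunction<K> factory) { this.factory = factory; }

        final K getArray(int length) {
            final K recycled = pool.poll();
            return (recycled != null) ? prepare(recycled)
                                      : factory.apply(length);
        }

        final void putArray(K array) { pool.offer(array); }

        // The clean/dirty policy lives here.
        abstract K prepare(K recycled);
    }

    // Clean variant: zero a recycled array before handing it out.
    final class CleanIntArrayCache extends ArrayCache<int[]> {
        CleanIntArrayCache() { super(int[]::new); }
        @Override int[] prepare(int[] a) { Arrays.fill(a, 0); return a; }
    }

    // Dirty variant: hand it back as-is; the caller clears only the
    // region it actually used.
    final class DirtyIntArrayCache extends ArrayCache<int[]> {
        DirtyIntArrayCache() { super(int[]::new); }
        @Override int[] prepare(int[] a) { return a; }
    }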

I think I can trust the existing architecture for at least a putback of the last webrev that you sent out while we brainstorm a better caching mechanism - let me make one last pass through it and I'll send an OK later this afternoon.

    What do you mean by "making further real improvements in the
    algorithms"?

Things like investigating DDA in the inner loops, alternate ways to manage crossings that require less sorting, enhancing the stroker to do non-uniform scaling (which would eliminate the double transform in some cases), potentially passing the crossings to the output loop rather than having to convert them into an alpha array, etc. A lot of that is easier to review if there isn't a lot of extra "this line added 1% performance" code adding visual noise.
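
To illustrate the first of those: the DDA idea, in a nutshell, is that the x-crossing of an edge advances by a constant increment per scanline, so the inner loop needs one add instead of re-solving the line equation each time. A toy sketch (not Marlin code, names are made up):

    static void emitCrossings(float x0, float y0, float x1, float y1) {
        final float dxdy = (x1 - x0) / (y1 - y0); // x step per scanline
        final int yStart = (int) Math.ceil(y0);
        final int yEnd   = (int) Math.ceil(y1);
        // x where the edge meets the first scanline it covers
        float x = x0 + (yStart - y0) * dxdy;
        for (int y = yStart; y < yEnd; y++) {
            System.out.println("y=" + y + " crossing at x=" + x);
            x += dxdy; // one add per scanline, no per-line multiply
        }
    }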

    Who could help? I am the only one working concretely on this code -
    help wanted.

I'd love to get more people involved, but history tells us that this is a fairly lonely field. :(

I may have some time, now that FX is in tightened bug-fix mode, to do some tinkering, but mainly I will probably be much quicker on getting reviews done, which may help as much as having another pair of hands in practice...

                        ...jim
