Hi Denis,

A generic suggestion - make all of your classes final - that gives the compiler the maximum flexibility to inline any methods you write.

With respect to the algorithm choices:

I think they key is that the X sorting rarely has any work to do. The first test of "does this edge need to be swapped with the next lower edge" is probably 99.999% guaranteed to be false. Thus, trying to optimize anything else to simplify swapping is likely a step in the wrong direction.

The casting may be hurting a bit more, and it is hurting on every access to an edge.

I'm guessing the best model to use would be to write the code to assume no swapping is necessary at all. Get that code as simple and as fast as can be. Then add as little perturbation of that code to manage swapping when it is necessary, and that will likely be the optimal implementation. I think you could probably even do some benchmarking on a path that is carefully tested to process lots of edges without any X sorting and get that as fast as you can without any swap code, and then add the swap code that impacts the performance of that operation as little as possible. The key is that the swap code have as little impact on the performance of the case that never needs any swapping at all first and foremost - then make swapping faster within that constraint...

                        ...jim

On 11/1/10 3:13 PM, Denis Lila wrote:
Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the scanlines
too. What happens is that on any iteration, the active list is the
doubly linked list buckets[nextY-boundsMinY]. I did this because I thought
less memory would need to be moved around compared to when we just kept
the active list "pointers" in an array. For example, with doubly linked
lists we can implement insertion sort with O(1) writes. With arrays we
have to use O(n) writes. This also allows us to get rid of the the edgePtrs
array.

I ran some benchmarks, and unfortunately I was wrong, somehwere. All the tests
are at least 10% slower than the insertion sort version of what we have now.
I can't immediately see why this is. The only thing I can think of is that
there are a lot of float ->  int casts, but then again, I don't know how
slow this operation is. It would be good if it's because of the casts because
it would no longer be an issue when we convert to native.

I alo made an unrelated change: I now find the orientation and update x0,y0
much earlier than we used to. The way I was doing it before was silly.

Regards,
Denis.

----- "Jim Graham"<james.gra...@oracle.com>  wrote:

Hi Denis,

Good news!

On 10/28/2010 3:27 PM, Denis Lila wrote:
If we moved to a Curve class or some other way to
consolidate the 3 lists (may be easier in native code), this might
win
in more cases...

Does that mean you no longer think we should flatten every curve as
soon
as we get it?

No, I was just discussion the feasibility of that one suggestion in
the
context of all of the possibilities of whether or not we took the
other
choices.  If you think that flattening will pay off then we don't have

to worry about the 3 lists.  It was just that when I hacked it in, I
still had your 3 lists and so the pros and cons of horizontal edge
sorting were relative to that version of the renderer...

                        ...jim

Reply via email to