Hi Denis,

I had a bit of luck with the following next() method:

        private int next() {
            // TODO: make function that convert from y value to bucket idx?
            int bucket = nextY - boundsMinY;
for (int ecur = edgeBuckets[bucket]; ecur != NULL; ecur = (int)edges[ecur+NEXT]) {
                edgePtrs = LilaHelpers.widenArray(edgePtrs, edgeCount, 1);
                edgePtrs[edgeCount++] = ecur;
                // REMIND: Adjust start Y if necessary
            }
            int crossingCount = edgeCount;
crossings = LilaHelpers.widenArray(crossings, 0, crossingCount);
            nextY++;
            for (int i = 0; i < edgeCount; i++) {
                int ecur = edgePtrs[i];
                float curx = edges[ecur+CURX];
                int cross = ((int) curx) << 1;
                edges[ecur+CURX] = curx + edges[ecur+SLOPE];
                if (edges[ecur+OR] > 0) {
                    cross |= 1;
                }
                int j = i;
                while (--j >= 0) {
                    int jcross = crossings[j];
                    if (jcross <= cross) {
                        break;
                    }
                    crossings[j+1] = jcross;
                    edgePtrs[j+1] = edgePtrs[j];
                }
                crossings[j+1] = cross;
                edgePtrs[j+1] = ecur;
            }
            int newCount = 0;
            for (int i = 0; i < edgeCount; i++) {
                int ecur = edgePtrs[i];
                if (edges[ecur+YMAX] > nextY) {
                    edgePtrs[newCount++] = ecur;
                }
            }
            edgeCount = newCount;
            return crossingCount;
        }

This allowed me to:

- delete a lot of the bucket helper functions.
- get rid of the back pointers
- pare an edge down to 5 values (YMAX, CURX, OR, SLOPE, and NEXT)

I also used the following addLine() method:

    private void addLine(float x1, float y1, float x2, float y2, int or) {
        final int firstCrossing = (int)Math.max(Math.ceil(y1), boundsMinY);
        if (firstCrossing >= boundsMaxY) {
            return;
        }
        final int ptr = numEdges * SIZEOF_EDGE;
        final float slope = (x2 - x1) / (y2 - y1);
        edges = LilaHelpers.widenArray(edges, ptr, SIZEOF_EDGE);
        numEdges++;
        edges[ptr+OR] = or;
        edges[ptr+CURX] = x1 + (firstCrossing - y1) * slope;
        edges[ptr+SLOPE] = slope;
        edges[ptr+YMAX] = y2;
        final int bucketIdx = firstCrossing - boundsMinY;
        addEdgeToBucket(ptr, bucketIdx);
    }

How does that fare for you?

                        ...jim

On 11/2/2010 4:10 PM, Denis Lila wrote:
Hi Jim.

I implemented a middle ground between what I sent yesterday and
what we have now: using the array of linked lists only to replace
the quicksort (I think this was your original suggestion).

Unfortunately, this didn't work out (at least according to the
benchmarks). Curves were no different than what we used to have,
while the performance on lines (especially horizontal ones) decreased
significantly.

It's not obvious to me why this happened, so I think now I will put
this type of optimization aside and convert to JNI, where profiling
will be easier (for me - I haven't been able to make OProfile work
for java yet).

Regards,
Denis.

----- "Jim Graham"<james.gra...@oracle.com>  wrote:

Hi Denis,

A generic suggestion - make all of your classes final - that gives the

compiler the maximum flexibility to inline any methods you write.

With respect to the algorithm choices:

I think they key is that the X sorting rarely has any work to do.  The

first test of "does this edge need to be swapped with the next lower
edge" is probably 99.999% guaranteed to be false.  Thus, trying to
optimize anything else to simplify swapping is likely a step in the
wrong direction.

The casting may be hurting a bit more, and it is hurting on every
access
to an edge.

I'm guessing the best model to use would be to write the code to
assume
no swapping is necessary at all.  Get that code as simple and as fast
as
can be.  Then add as little perturbation of that code to manage
swapping
when it is necessary, and that will likely be the optimal
implementation.  I think you could probably even do some benchmarking
on
a path that is carefully tested to process lots of edges without any X

sorting and get that as fast as you can without any swap code, and
then
add the swap code that impacts the performance of that operation as
little as possible.  The key is that the swap code have as little
impact
on the performance of the case that never needs any swapping at all
first and foremost - then make swapping faster within that
constraint...

                        ...jim

On 11/1/10 3:13 PM, Denis Lila wrote:
Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the
scanlines
too. What happens is that on any iteration, the active list is the
doubly linked list buckets[nextY-boundsMinY]. I did this because I
thought
less memory would need to be moved around compared to when we just
kept
the active list "pointers" in an array. For example, with doubly
linked
lists we can implement insertion sort with O(1) writes. With arrays
we
have to use O(n) writes. This also allows us to get rid of the the
edgePtrs
array.

I ran some benchmarks, and unfortunately I was wrong, somehwere. All
the tests
are at least 10% slower than the insertion sort version of what we
have now.
I can't immediately see why this is. The only thing I can think of
is that
there are a lot of float ->   int casts, but then again, I don't know
how
slow this operation is. It would be good if it's because of the
casts because
it would no longer be an issue when we convert to native.

I alo made an unrelated change: I now find the orientation and
update x0,y0
much earlier than we used to. The way I was doing it before was
silly.

Regards,
Denis.

----- "Jim Graham"<james.gra...@oracle.com>   wrote:

Hi Denis,

Good news!

On 10/28/2010 3:27 PM, Denis Lila wrote:
If we moved to a Curve class or some other way to
consolidate the 3 lists (may be easier in native code), this
might
win
in more cases...

Does that mean you no longer think we should flatten every curve
as
soon
as we get it?

No, I was just discussion the feasibility of that one suggestion
in
the
context of all of the possibilities of whether or not we took the
other
choices.  If you think that flattening will pay off then we don't
have

to worry about the 3 lists.  It was just that when I hacked it in,
I
still had your 3 lists and so the pros and cons of horizontal edge
sorting were relative to that version of the renderer...

                        ...jim

Reply via email to