On 10/26/2010 6:58 AM, Denis Lila wrote:
If we are really worried about the y-sort, then how about creating a
bunch of buckets and doing a bucket sort of the edges?  As they are
added to the list of segments, we accumulate their indices in a row
list based on their startY so that each step of the "next()" simply moves
to the next Y and adds the edges mentioned in the list there.  Some work
would have to be done on how to manage the storage simply (like a
"rownext" field in the edge structure so that they are linked in a
linked list), but then there would be no qsort at all for the cost of
new int[N_ROWS] and an extra field in every edge.

I like this.

It's even more frugal if combined with my idea that an entire full pixel row could be processed in one batch rather than processing each sub-pixel row separately.

90% (guesstimate) of the time edges do not cross each other, thus if
you sort the crossings without reordering the active edges then you just
end up doing the same sorting work (same swaps) on the next scanline.  My
SpanShapeIterator code actually reordered the edges on every sample
line to match their current X coordinates in a way that involved 1 compare
per edge that was processed and only occasionally a swap of 2 edge
pointers.  It would basically eliminate the sort at line 149 at the
cost of only doing a compare in the nextY processing for the very very
common case.

I also implemented this some time ago. I didn't keep it because it slowed
things down a bit. However, I only tested it with very complex and large
paths, and in hindsight, I shouldn't have based my decision on that, so I
will re-implement it.

First, this is less helpful if we process the edges/quads/curves in 3 separate batches because you'd still have 3 sets of crossings to interweave and so some sorting would still be necessary, but keeping each of the 3 sets sorted amongst themselves might still help with reducing the swapping work of the sorting step. This technique would work best if all of the edge types were processed from one list.

Also, be sure to test with both clockwise and couterclockwise versions of the test paths just in case. The (constantly re)sorting hit will be more expensive with one than the other since one of them will have the edges already in the sorted order and the other would require maximum swaps. I'd check:

Simple CW rectangle
Simple CCW rectangle
CW and CCW rectangles with 500(?) zig-zags along top edge
CW and CCW rectangles with 500(?) zig-zags along both top and bottom

And a note related to another suggestion I'd made, I think if we did something like "process an entire pixel row at once" it would complicate the "auto-edge sorting" a bit. If every curve is iterated to the bottom of the pixel in turn before moving to the next edge then its crossings may swap with another edge's crossings, but that next edge would not be processed yet and then there may be a mish-mosh of N subpixel values from the two edges to interweave. If it was done by processing each subpixel row in turn then the edges could be kept sorted as the subpixel rows were iterated, but it would reduce the savings on the overhead of swapping between edges. The best compromise might be to process every curve fully, verify its sorted position in the edge array using its X coordinate at the bottom of the pixel, still sort the whole pixel row's worth of crossing values, but hopefully the sorting would have very little work to do if the original values were "mostly sorted" by having kept the edges in order at pixel boundaries (a "mostly sorted" friendly sort algorithm, like insertion sort, might be needed).

                        ...jim

Reply via email to