Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
To the end of getting rid of "flips" - all they are used for is to resize the crossings array, right? This is not a huge array that costs a lot to resize - why not simply use a default array of, say, 30 elements and then resize it if we ever have more crossings than that? Only very complicated paths would have more than 30 crossings to track. The check for array length is only needed once per scanline since we know how many active edges are on each scan line (hi-lo) and you can only have 1 crossing per active edge so with one test per scanline we can keep the crossings array within range... ...jim Jim Graham wrote: Hi Denis, Well, I guess it's a good thing that Java2Demo had a path like that in it - not a very common case, so it's good we found it! The fix looks fine. It still seems like there is way more logic there than is needed - hopefully if we can get rid of flips at some point, much of it will go away. Fixes look good to go to me... ...jim On 8/5/2010 3:58 PM, Denis Lila wrote: Hi Jim. I didn't know about Java2Demo. If I did I would have run it sooner. But I ran it a few hours ago, and everything looked fine (surprisingly high fps too) until I got to the append test. Apparently I introduced a bug when solving the "2 consecutive moveTos bug". Basically, when there's a close() after a horizontal lineTo(), the lineTo in close() won't be executed because it's inside the if (firstOrientation != 0) test. So instead of going back to the starting point, close will stay where it is, which will draw a triangle above the rectangle. I fixed this by introducing a variable that keeps track of the last method called (lineTo, moveTo, or close), and instead of checking for firstOrientation != 0 in close(), I check for (last == LINE_TO). webrev (hopefully final): http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/ I'm sorry about this. I wish I had known about Java2Demo sooner. Thanks, Denis. - "Jim Graham" wrote: Hi Denis, That's great! I just did a last minute double-check of your last (final) webrevs to be sure. Have you tested Java2Demo with these changes? I'd also run any regression tests you can find with the changes. If there are no problems there, then you are good to go to push it... ...jim On 8/5/2010 8:08 AM, Denis Lila wrote: Hello. Are you a registered OpenJDK developer? I am now. Can I go ahead and push it? Regards, Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, Well, I guess it's a good thing that Java2Demo had a path like that in it - not a very common case, so it's good we found it! The fix looks fine. It still seems like there is way more logic there than is needed - hopefully if we can get rid of flips at some point, much of it will go away. Fixes look good to go to me... ...jim On 8/5/2010 3:58 PM, Denis Lila wrote: Hi Jim. I didn't know about Java2Demo. If I did I would have run it sooner. But I ran it a few hours ago, and everything looked fine (surprisingly high fps too) until I got to the append test. Apparently I introduced a bug when solving the "2 consecutive moveTos bug". Basically, when there's a close() after a horizontal lineTo(), the lineTo in close() won't be executed because it's inside the if (firstOrientation != 0) test. So instead of going back to the starting point, close will stay where it is, which will draw a triangle above the rectangle. I fixed this by introducing a variable that keeps track of the last method called (lineTo, moveTo, or close), and instead of checking for firstOrientation != 0 in close(), I check for (last == LINE_TO). webrev (hopefully final): http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/ I'm sorry about this. I wish I had known about Java2Demo sooner. Thanks, Denis. - "Jim Graham" wrote: Hi Denis, That's great! I just did a last minute double-check of your last (final) webrevs to be sure. Have you tested Java2Demo with these changes? I'd also run any regression tests you can find with the changes. If there are no problems there, then you are good to go to push it... ...jim On 8/5/2010 8:08 AM, Denis Lila wrote: Hello. Are you a registered OpenJDK developer? I am now. Can I go ahead and push it? Regards, Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Jim. I didn't know about Java2Demo. If I did I would have run it sooner. But I ran it a few hours ago, and everything looked fine (surprisingly high fps too) until I got to the append test. Apparently I introduced a bug when solving the "2 consecutive moveTos bug". Basically, when there's a close() after a horizontal lineTo(), the lineTo in close() won't be executed because it's inside the if (firstOrientation != 0) test. So instead of going back to the starting point, close will stay where it is, which will draw a triangle above the rectangle. I fixed this by introducing a variable that keeps track of the last method called (lineTo, moveTo, or close), and instead of checking for firstOrientation != 0 in close(), I check for (last == LINE_TO). webrev (hopefully final): http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/ I'm sorry about this. I wish I had known about Java2Demo sooner. Thanks, Denis. - "Jim Graham" wrote: > Hi Denis, > > That's great! I just did a last minute double-check of your last > (final) webrevs to be sure. > > Have you tested Java2Demo with these changes? I'd also run any > regression tests you can find with the changes. If there are no > problems there, then you are good to go to push it... > > ...jim > > On 8/5/2010 8:08 AM, Denis Lila wrote: > > Hello. > > > >> Are you a registered OpenJDK developer? > > I am now. > > Can I go ahead and push it? > > > > Regards, > > Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, That's great! I just did a last minute double-check of your last (final) webrevs to be sure. Have you tested Java2Demo with these changes? I'd also run any regression tests you can find with the changes. If there are no problems there, then you are good to go to push it... ...jim On 8/5/2010 8:08 AM, Denis Lila wrote: Hello. Are you a registered OpenJDK developer? I am now. Can I go ahead and push it? Regards, Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello. > Are you a registered OpenJDK developer? I am now. Can I go ahead and push it? Regards, Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello Jim. > I'd need to see a final webrev and approve it and then anyone with an > OpenJDK user id can push it. The final webrev is http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/ The only things that have changed since the last one you commented on are the "y == boundsMaxY - 1" check, and a few comments in Renderer.java. > I'll also have to create a bugid for the change set. I've already submitted a bug that this patch would fix. The ID is 6967436. > Are you a registered OpenJDK developer? Not just yet. I will be soon, I think. Thanks, Denis. - "Jim Graham" wrote: > Hi Denis, > > > ...jim > > Denis Lila wrote: > > Hello Jim. > > > >> I'm guessing the test for "y == boundsMaxY-1" at line 470 could > >> probably also be deleted now (since it will be handled by the end > >> test when y reaches maxY)? > > > > Indeed it can. I've done this, and also updated a few comments that > > might have been misleading (they were written before certain > changes > > were made). > > > >> It looks fine. Hopefully we can eventually figure out why the > sorting > >> on the fly didn't pan out. > > > > Wonderful! So, will it be committed? > > > > Thanks, > > Denis. > > > > - "Jim Graham" wrote: > > > >> Hi Denis, > >> > > > >> Denis Lila wrote: > >>> Hi Jim. > >>> > >>> Thanks for all your suggestions. I fixed the edge array > >> indexing > >>> issue, the moveTo bug (not assigning x0), and the double > >>> initialization issue. I also improved the emission of the last > row > >>> to what you said. The link is the same as the last one I sent. > > > >> But everything looks in order! > >> > >>...jim
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, I'd need to see a final webrev and approve it and then anyone with an OpenJDK user id can push it. I'll also have to create a bugid for the change set. Are you a registered OpenJDK developer? ...jim Denis Lila wrote: Hello Jim. I'm guessing the test for "y == boundsMaxY-1" at line 470 could probably also be deleted now (since it will be handled by the end test when y reaches maxY)? Indeed it can. I've done this, and also updated a few comments that might have been misleading (they were written before certain changes were made). It looks fine. Hopefully we can eventually figure out why the sorting on the fly didn't pan out. Wonderful! So, will it be committed? Thanks, Denis. - "Jim Graham" wrote: Hi Denis, Denis Lila wrote: Hi Jim. Thanks for all your suggestions. I fixed the edge array indexing issue, the moveTo bug (not assigning x0), and the double initialization issue. I also improved the emission of the last row to what you said. The link is the same as the last one I sent. But everything looks in order! ...jim
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello Jim. > I'm guessing the test for "y == boundsMaxY-1" at line 470 could > probably also be deleted now (since it will be handled by the end > test when y reaches maxY)? Indeed it can. I've done this, and also updated a few comments that might have been misleading (they were written before certain changes were made). > It looks fine. Hopefully we can eventually figure out why the sorting > on the fly didn't pan out. Wonderful! So, will it be committed? Thanks, Denis. - "Jim Graham" wrote: > Hi Denis, > > > Denis Lila wrote: > > Hi Jim. > > > > Thanks for all your suggestions. I fixed the edge array > indexing > > issue, the moveTo bug (not assigning x0), and the double > > initialization issue. I also improved the emission of the last row > > to what you said. The link is the same as the last one I sent. > > > But everything looks in order! > > ...jim
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, It looks fine. Hopefully we can eventually figure out why the sorting on the fly didn't pan out. Denis Lila wrote: Hi Jim. Thanks for all your suggestions. I fixed the edge array indexing issue, the moveTo bug (not assigning x0), and the double initialization issue. I also improved the emission of the last row to what you said. The link is the same as the last one I sent. I'm guessing the test for "y == boundsMaxY-1" at line 470 could probably also be deleted now (since it will be handled by the end test when y reaches maxY)? But everything looks in order! ...jim
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Jim. Thanks for all your suggestions. I fixed the edge array indexing issue, the moveTo bug (not assigning x0), and the double initialization issue. I also improved the emission of the last row to what you said. The link is the same as the last one I sent. I considered one more optimization for the scan line iteration: instead of keeping a crossings array and Arrays.sort'ing it for every scanline, utilize the fact that the crossings information is already in the active list, and doesn't need to be copied, so we could just keep the active list sorted by increasing x crossing (this is the way things are done in ShapeSpanIterator, which is where I got the idea from). An added benefit of this (so I thought) would be that the order of the edges sorted by x crossing would not change very often, and not very much, in which case keeping it sorted by insertion sort would have (on average) linear time complexity per scan line, instead of nlogn which is what Arrays.sort has. After implementing this, I tested it, and unfortunately it turns out not to be as good as I expected. It is faster in some tests, but it is a lot slower in some others. It's a shame because it simplified things quite a bit (the file became just 440 lines). With it I was able to remove about 4 or 5 variables from Renderer, flips among them so we no longer needed to keep track of orientations, since we only do that to determine the size of the crossings array. > For now, it might be simpler to ignore this and revisit that later > since it isn't going to be common to have large such jumps in the > middle of most paths. I agree. Not only that, but even if there are gaps I don't think it will be much of a problem since very little work has to be done to traverse scan lines with no crossings. > Filling the alpha row. I had an interesting optimization here that I > > never got around to. Instead of filling the array entries with the > alpha values, fill them with the deltas of the alpha values. The > inner > loop in endRendering then becomes something like: > > alpha[pix_x0 ] += NUM_POS - (x0 & MASK); > alpha[pix_x0+1] += (x0 & MASK); > alpha[pix_x1 ] -= NUM_POS - (x1 & MASK); > alpha[pix_x1+1] -= (x1 & MASK); That's a good idea. I implemented it but I think we should use it whether it offers big savings or not. It's just a nicer algorithm. Although, the way it is now, it is a bit misleading (the name of the array is alpha). I should have commented this a bit more. I don't have time now, I'll do it on Monday. Thanks, Denis. - "Jim Graham" wrote: > Hi Denis, > > More thoughts on Renderer.java. > > -- Skipping gaps (minor optimization) -- > > If there is a gap in the edges in Y, say if a path consists of two > subpaths, one that covers y=[0..10] and another that covers > y=[1000..1010], then I think you will iterate over each y value from > 10 > to 1000, but have no work to do on each scan line. That could > possibly > waste a bit of time. > > On the other hand, fixing that would have to take into account whether > > or not you are done with a given alpha row, so the "NextY" function > can't simply skip - the skipping has to be done at the higher level in > > endRendering - or at least with the cooperation of endRendering. > Since > it asks what the "current Y" is near the top of the loop, it could > detect if the current Y jumped out of the given alpha row, emit it, > and > prepare for a new alpha row starting at the new Y...? > > > -- Done with skipping gaps -- > > -- Alpha accumulation opt -- > > > The [pix+1] lines were the gotcha that always confused me when I tried > > to do this in the past. Basically if you enter an inside region at 1 > > subpixel before the end of a pixel then you need to add one to the > coverage for that pixel. But, starting with the next pixel you want > the > total contribution of the interior region to be NUM_POS per pixel, but > > you've only added 1 so far - so you have to add the additional amount > so > that the total amount added for each "enter" crossing ends up being > NUM_POS. Similarly, when you decrement for the "exit" crossing you > need > to ensure that the total negative delta sums to NUM_POS across the > pixel > where the exit happens and the following pixel. Does that make > sense? > > Note that you could still have the "single pixel" optimization which > would look like: > > alpha[pix_x0 ] += (x1 - x0); > alpha[pix_x0+1] -= (x1 - x0); > > and is equivalent to the above 4 lines. > > then when you do the RLE, you simply have to use an accumulation as > you > scan the alpha row: > > byte nextVal = startVal + alphaRow[i]; > > So, for the cost of an add per pixel as you do the RLE you can avoid > having to do the loop in the "multiple pixel section" when filling the > > alpha array. > > I don't think I've ever quantified this optimization so it might be > better to investigate
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Just to clarify. My second message that I just sent mostly contained some additional optimizations to consider for now or later, but this message below contained at least one (maybe 2) thing(s) that looked like a bug and a few maintenance issues that I think should be done before finalizing this set of changes... ...jim On 7/29/2010 5:27 PM, Jim Graham wrote: Hi Denis, The changes look fine and I'm moving on to Renderer.java... I'd make as many methods private as you can. It looks like all your new methods are private, but some old methods are still public even though I don't think they are accessed elsewhere (like addEdge()?). I think "private" is enough for Hotspot to inline so that should help performance a bit. I usually use "final", but I think "private" does the same thing. You should create constants for the indices into the struct to make the code more readable (and to simplify updating the EDGE layout): X0_OFF = 0 Y0_OFF = 1 // etc. I don't think you touch X0 and Y0 after initializing them and then using them for the first setCurY so you could just use them as the "curX" and "curY" and save 2 slots in the table. You initialize edges twice - once when it is declared, and once in the constructor. Note that the initialization where it is declared uses INIT_SIZE which is not a multiple of EDGE size anyway. It looks like there is a missing statement in moveTo to initialize this.x0...? I need some more time on it, but I thought I would send along these comments in the meantime... ...jim Denis Lila wrote: Hello Jim. I made the changes you point out, except for your second point (I don't have time to think about it right now). I stopped precomputing m00_2_m01_2 because it was only being used in one place. But I guess it worth it to precompute it if it's going to be used often (and computeOffset is called for every line). The new webrev is at http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/ Thanks, Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, More thoughts on Renderer.java. -- Skipping gaps (minor optimization) -- If there is a gap in the edges in Y, say if a path consists of two subpaths, one that covers y=[0..10] and another that covers y=[1000..1010], then I think you will iterate over each y value from 10 to 1000, but have no work to do on each scan line. That could possibly waste a bit of time. On the other hand, fixing that would have to take into account whether or not you are done with a given alpha row, so the "NextY" function can't simply skip - the skipping has to be done at the higher level in endRendering - or at least with the cooperation of endRendering. Since it asks what the "current Y" is near the top of the loop, it could detect if the current Y jumped out of the given alpha row, emit it, and prepare for a new alpha row starting at the new Y...? For now, it might be simpler to ignore this and revisit that later since it isn't going to be common to have large such jumps in the middle of most paths. -- Done with skipping gaps -- -- Alpha accumulation opt -- Filling the alpha row. I had an interesting optimization here that I never got around to. Instead of filling the array entries with the alpha values, fill them with the deltas of the alpha values. The inner loop in endRendering then becomes something like: alpha[pix_x0 ] += NUM_POS - (x0 & MASK); alpha[pix_x0+1] += (x0 & MASK); alpha[pix_x1 ] -= NUM_POS - (x1 & MASK); alpha[pix_x1+1] -= (x1 & MASK); The [pix+1] lines were the gotcha that always confused me when I tried to do this in the past. Basically if you enter an inside region at 1 subpixel before the end of a pixel then you need to add one to the coverage for that pixel. But, starting with the next pixel you want the total contribution of the interior region to be NUM_POS per pixel, but you've only added 1 so far - so you have to add the additional amount so that the total amount added for each "enter" crossing ends up being NUM_POS. Similarly, when you decrement for the "exit" crossing you need to ensure that the total negative delta sums to NUM_POS across the pixel where the exit happens and the following pixel. Does that make sense? Note that you could still have the "single pixel" optimization which would look like: alpha[pix_x0 ] += (x1 - x0); alpha[pix_x0+1] -= (x1 - x0); and is equivalent to the above 4 lines. then when you do the RLE, you simply have to use an accumulation as you scan the alpha row: byte nextVal = startVal + alphaRow[i]; So, for the cost of an add per pixel as you do the RLE you can avoid having to do the loop in the "multiple pixel section" when filling the alpha array. I don't think I've ever quantified this optimization so it might be better to investigate it after the current code goes in and adopt it only if it shows a big savings. -- Done with alpha accumulation opt -- Something about the "emit last row" code bothers me. First, I would guess that the "y == boundsMaxY - 1" test would have already flushed the row, right? Also, why do the for loop? Why not just flush the row if it isn't flushed? I kind of feel like the "y == boundsMaxY-1" test could be removed from inside the loop and simply test if there is unflushed data in the alpha array then just make a call to emitRow() with the appropriate values after the loop. while (hasNext()) { ... if ((y & MASK) == MASK) { emitRow(...); pix_min,max = MAX,MIN; } } if (pix_min < pix_max) { emitRow(...); } Am I missing something? So, the upshot is that I didn't find anything wrong. You can take or leave my suggestions for improvements as you see fit and maybe save some of them for a second round after this goes back... ...jim On 7/29/2010 2:16 PM, Denis Lila wrote: Hello Jim. I made the changes you point out, except for your second point (I don't have time to think about it right now). I stopped precomputing m00_2_m01_2 because it was only being used in one place. But I guess it worth it to precompute it if it's going to be used often (and computeOffset is called for every line). The new webrev is at http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/ Thanks, Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, The changes look fine and I'm moving on to Renderer.java... I'd make as many methods private as you can. It looks like all your new methods are private, but some old methods are still public even though I don't think they are accessed elsewhere (like addEdge()?). I think "private" is enough for Hotspot to inline so that should help performance a bit. I usually use "final", but I think "private" does the same thing. You should create constants for the indices into the struct to make the code more readable (and to simplify updating the EDGE layout): X0_OFF = 0 Y0_OFF = 1 // etc. I don't think you touch X0 and Y0 after initializing them and then using them for the first setCurY so you could just use them as the "curX" and "curY" and save 2 slots in the table. You initialize edges twice - once when it is declared, and once in the constructor. Note that the initialization where it is declared uses INIT_SIZE which is not a multiple of EDGE size anyway. It looks like there is a missing statement in moveTo to initialize this.x0...? I need some more time on it, but I thought I would send along these comments in the meantime... ...jim Denis Lila wrote: Hello Jim. I made the changes you point out, except for your second point (I don't have time to think about it right now). I stopped precomputing m00_2_m01_2 because it was only being used in one place. But I guess it worth it to precompute it if it's going to be used often (and computeOffset is called for every line). The new webrev is at http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/ Thanks, Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello Jim. I made the changes you point out, except for your second point (I don't have time to think about it right now). I stopped precomputing m00_2_m01_2 because it was only being used in one place. But I guess it worth it to precompute it if it's going to be used often (and computeOffset is called for every line). The new webrev is at http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/ Thanks, Denis. - "Jim Graham" wrote: > Hi Denis, > > Only some minor comments so far: > > Stroker.java: > > - Should det be precomputed and saved? (You calculate it in the > constructor anyway, but don't save it.) > - Should test for uniform scale be precomputed? > - (Is test for uniform scale too strict? Can a rotated uniform scale > > use the same code as upright uniform scale?) > - Why are m00_2_m01_2 et al no longer precomputed (they only need to > be > precomputed if scale is not uniform which isn't common)? > - lineLength method is "user space length" isn't it? - a more > descriptive name might help avoid confusion if someone is modifying > code > here later. > - line 614 - missing space > > Dasher.java: > > - Line 187 - use "leftInThisDashSegment" here? > > I still have to look at Renderer.java in depth, but I thought I'd send > > these minor comments along while they are fresh in my email buffer... > > ...jim > > Denis Lila wrote: > > Hello. > > > > And, here it is: > http://icedtea.classpath.org/~dlila/webrevs/fpBetterAA/webrev/ > > > > Just as I thought, it's way faster than the original. I made a new > > test, which is supposed to be more realistic than drawing 3 > length > > lines. It consists of splitting a 1000x1000 frame in 100 10x10 > squares > > and in each square drawing 20 or so curves with random > start/end/control > > points in that square. In this case it's at least twice faster > (~0.85 > > versus ~1.95 seconds). > > > > Unfortunately I've had to do a lot of the same optimizations that I > was > > trying to remove (i.e. making everything a global variable). I > tried > > writing a higher level version of it, but it completely negated the > > performance gains of the new algorithm. Anyway, it's still not as > bad > > as before because the algorithm is inherently clearer and we only > iterate > > through scan lines instead of iterating through strips and then > scanlines > > in that strip, and then edges, all in the same method. > > > > It's also better organized, and logically separate parts of the code > don't > > really touch each other's variables much. And it's definitely > better > > commented. > > > > I also made some minor changes outside of Renderer that did not > appear in > > the first version of this patch last week: > > 1. I fixed the issue Jim pointed out in Dasher, where if > > (origLen == leftInThisDashSegment) the dash index was not being > incremented. > > 2. In dasher, I no longer copy the dash array. > > 3. I removed files PiscesMath.java and Transform4.java because they > are no > > longer needed. > > > > Thanks, > > Denis. > > > > - "Jim Graham" wrote: > > > >> Woohoo, Denis! I look forward to seeing the new version! > >> > >>...jim > >> > >> On 7/28/2010 5:51 AM, Denis Lila wrote: > >>> Hello Jim. > >>> > >>> This one performs almost identically to what is already there > >>> in openjdk6 and 7, since it's exactly what I sent for review > >>> last week, but with all the changes you suggested implemented. > >>> > >>> I would actually like to ask you to not look at either one of > >>> them. First of all, there is an ArrayIndexOutOfBoundsException > >>> possible in emitRow. > >>> And secondly, even if there wasn't, last night I implemented > >>> your algorithm from ShapeSpanIterator.c to iterate through > >>> the edges. I have yet to debug it, but it makes everything much, > >>> much simpler, and it should make it far faster, so we get the > best > >>> of both worlds. > >>> > >>> Thanks, > >>> Denis. > >>> > >>> - "Jim Graham" wrote: > >>> > Hi Denis, > > I'll try to get through both versions and see if I can find > >> anything > that was hurting performance with your EdgeLists. I'm guessing > >> that > this version was created because of the performance issues you > >> found > with the EdgeList version? Does this perform more closely to > the > existing code than the EdgeList version? > > ...jim > > Denis Lila wrote: > > Hello again. > > > > This attachmet is a "can of worms" implementation without all > the > fancy (and slow) > > iteration. It also includes all of the other suggestions you > sent > >> in > your first > > review of Dasher and Renderer last week (most importantly, the > firstOrientation > > issue, horizontal lines filtering, and adding prefixes to > >> variable > names to make > > it clear whether they refer to pixels, or subpixels). > > > >
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, Only some minor comments so far: Stroker.java: - Should det be precomputed and saved? (You calculate it in the constructor anyway, but don't save it.) - Should test for uniform scale be precomputed? - (Is test for uniform scale too strict? Can a rotated uniform scale use the same code as upright uniform scale?) - Why are m00_2_m01_2 et al no longer precomputed (they only need to be precomputed if scale is not uniform which isn't common)? - lineLength method is "user space length" isn't it? - a more descriptive name might help avoid confusion if someone is modifying code here later. - line 614 - missing space Dasher.java: - Line 187 - use "leftInThisDashSegment" here? I still have to look at Renderer.java in depth, but I thought I'd send these minor comments along while they are fresh in my email buffer... ...jim Denis Lila wrote: Hello. And, here it is: http://icedtea.classpath.org/~dlila/webrevs/fpBetterAA/webrev/ Just as I thought, it's way faster than the original. I made a new test, which is supposed to be more realistic than drawing 3 length lines. It consists of splitting a 1000x1000 frame in 100 10x10 squares and in each square drawing 20 or so curves with random start/end/control points in that square. In this case it's at least twice faster (~0.85 versus ~1.95 seconds). Unfortunately I've had to do a lot of the same optimizations that I was trying to remove (i.e. making everything a global variable). I tried writing a higher level version of it, but it completely negated the performance gains of the new algorithm. Anyway, it's still not as bad as before because the algorithm is inherently clearer and we only iterate through scan lines instead of iterating through strips and then scanlines in that strip, and then edges, all in the same method. It's also better organized, and logically separate parts of the code don't really touch each other's variables much. And it's definitely better commented. I also made some minor changes outside of Renderer that did not appear in the first version of this patch last week: 1. I fixed the issue Jim pointed out in Dasher, where if (origLen == leftInThisDashSegment) the dash index was not being incremented. 2. In dasher, I no longer copy the dash array. 3. I removed files PiscesMath.java and Transform4.java because they are no longer needed. Thanks, Denis. - "Jim Graham" wrote: Woohoo, Denis! I look forward to seeing the new version! ...jim On 7/28/2010 5:51 AM, Denis Lila wrote: Hello Jim. This one performs almost identically to what is already there in openjdk6 and 7, since it's exactly what I sent for review last week, but with all the changes you suggested implemented. I would actually like to ask you to not look at either one of them. First of all, there is an ArrayIndexOutOfBoundsException possible in emitRow. And secondly, even if there wasn't, last night I implemented your algorithm from ShapeSpanIterator.c to iterate through the edges. I have yet to debug it, but it makes everything much, much simpler, and it should make it far faster, so we get the best of both worlds. Thanks, Denis. - "Jim Graham" wrote: Hi Denis, I'll try to get through both versions and see if I can find anything that was hurting performance with your EdgeLists. I'm guessing that this version was created because of the performance issues you found with the EdgeList version? Does this perform more closely to the existing code than the EdgeList version? ...jim Denis Lila wrote: Hello again. This attachmet is a "can of worms" implementation without all the fancy (and slow) iteration. It also includes all of the other suggestions you sent in your first review of Dasher and Renderer last week (most importantly, the firstOrientation issue, horizontal lines filtering, and adding prefixes to variable names to make it clear whether they refer to pixels, or subpixels). Regards, Denis. - "Denis Lila" wrote: Hello Jim. I implemented your "can of worms" idea. It works, and it got rid of the biasing. I wasn't able to send a webrev, but there are many changes and a side by side comparison would probably be useless, so I just attached the file. I hope this is ok. I also implemented a "better" iterating structure for the lines and the strips and crossings. I think it is better in every way, except performance. The new file is more than 200 lines smaller than the old one. The only members of Renderer are now the AA variables and the position variables (sx*, sy*, x*, y*). What I've done is I added an EdgeList class, which encapsulates all the edge related variables in the old Renderer. At first, I had an Edge class in addition to the EdgeList class, and while this was much nicer, it turned out to be too expensive (see last paragraph). I've also added a ScanLineIterator, so instead of _endRendering iterating through strips
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello. And, here it is: http://icedtea.classpath.org/~dlila/webrevs/fpBetterAA/webrev/ Just as I thought, it's way faster than the original. I made a new test, which is supposed to be more realistic than drawing 3 length lines. It consists of splitting a 1000x1000 frame in 100 10x10 squares and in each square drawing 20 or so curves with random start/end/control points in that square. In this case it's at least twice faster (~0.85 versus ~1.95 seconds). Unfortunately I've had to do a lot of the same optimizations that I was trying to remove (i.e. making everything a global variable). I tried writing a higher level version of it, but it completely negated the performance gains of the new algorithm. Anyway, it's still not as bad as before because the algorithm is inherently clearer and we only iterate through scan lines instead of iterating through strips and then scanlines in that strip, and then edges, all in the same method. It's also better organized, and logically separate parts of the code don't really touch each other's variables much. And it's definitely better commented. I also made some minor changes outside of Renderer that did not appear in the first version of this patch last week: 1. I fixed the issue Jim pointed out in Dasher, where if (origLen == leftInThisDashSegment) the dash index was not being incremented. 2. In dasher, I no longer copy the dash array. 3. I removed files PiscesMath.java and Transform4.java because they are no longer needed. Thanks, Denis. - "Jim Graham" wrote: > Woohoo, Denis! I look forward to seeing the new version! > > ...jim > > On 7/28/2010 5:51 AM, Denis Lila wrote: > > Hello Jim. > > > > This one performs almost identically to what is already there > > in openjdk6 and 7, since it's exactly what I sent for review > > last week, but with all the changes you suggested implemented. > > > > I would actually like to ask you to not look at either one of > > them. First of all, there is an ArrayIndexOutOfBoundsException > > possible in emitRow. > > And secondly, even if there wasn't, last night I implemented > > your algorithm from ShapeSpanIterator.c to iterate through > > the edges. I have yet to debug it, but it makes everything much, > > much simpler, and it should make it far faster, so we get the best > > of both worlds. > > > > Thanks, > > Denis. > > > > - "Jim Graham" wrote: > > > >> Hi Denis, > >> > >> I'll try to get through both versions and see if I can find > anything > >> that was hurting performance with your EdgeLists. I'm guessing > that > >> this version was created because of the performance issues you > found > >> with the EdgeList version? Does this perform more closely to the > >> existing code than the EdgeList version? > >> > >>...jim > >> > >> Denis Lila wrote: > >>> Hello again. > >>> > >>> This attachmet is a "can of worms" implementation without all the > >> fancy (and slow) > >>> iteration. It also includes all of the other suggestions you sent > in > >> your first > >>> review of Dasher and Renderer last week (most importantly, the > >> firstOrientation > >>> issue, horizontal lines filtering, and adding prefixes to > variable > >> names to make > >>> it clear whether they refer to pixels, or subpixels). > >>> > >>> Regards, > >>> Denis. > >>> > >>> - "Denis Lila" wrote: > >>> > Hello Jim. > > I implemented your "can of worms" idea. It works, and it got rid > >> of > the biasing. > I wasn't able to send a webrev, but there are many changes and a > >> side > by side > comparison would probably be useless, so I just attached the > file. > >> I > hope this is > ok. > > I also implemented a "better" iterating structure for the lines > >> and > the > strips and crossings. I think it is better in every way, except > performance. > The new file is more than 200 lines smaller than the old one. > The > >> only > members of Renderer are now the AA variables and the position > variables > (sx*, sy*, x*, y*). > What I've done is I added an EdgeList class, which encapsulates > >> all > the edge > related variables in the old Renderer. At first, I had an Edge > >> class > in addition > to the EdgeList class, and while this was much nicer, it turned > out > >> to > be too > expensive (see last paragraph). > I've also added a ScanLineIterator, so instead of _endRendering > iterating > through strips, and then calling renderStrip() which iterates > >> through > the > scanlines in that strip, and then through the crossings in that > scanline, > what happens now is that _endRendering uses the > Iterator > >> to > iterate through each scanline, get get its crossings and iterate > through them > to accumulate the alpha. By the way, a ScanLine is a type > defined > >> by > an > interface which exports methods for get
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Woohoo, Denis! I look forward to seeing the new version! ...jim On 7/28/2010 5:51 AM, Denis Lila wrote: Hello Jim. This one performs almost identically to what is already there in openjdk6 and 7, since it's exactly what I sent for review last week, but with all the changes you suggested implemented. I would actually like to ask you to not look at either one of them. First of all, there is an ArrayIndexOutOfBoundsException possible in emitRow. And secondly, even if there wasn't, last night I implemented your algorithm from ShapeSpanIterator.c to iterate through the edges. I have yet to debug it, but it makes everything much, much simpler, and it should make it far faster, so we get the best of both worlds. Thanks, Denis. - "Jim Graham" wrote: Hi Denis, I'll try to get through both versions and see if I can find anything that was hurting performance with your EdgeLists. I'm guessing that this version was created because of the performance issues you found with the EdgeList version? Does this perform more closely to the existing code than the EdgeList version? ...jim Denis Lila wrote: Hello again. This attachmet is a "can of worms" implementation without all the fancy (and slow) iteration. It also includes all of the other suggestions you sent in your first review of Dasher and Renderer last week (most importantly, the firstOrientation issue, horizontal lines filtering, and adding prefixes to variable names to make it clear whether they refer to pixels, or subpixels). Regards, Denis. - "Denis Lila" wrote: Hello Jim. I implemented your "can of worms" idea. It works, and it got rid of the biasing. I wasn't able to send a webrev, but there are many changes and a side by side comparison would probably be useless, so I just attached the file. I hope this is ok. I also implemented a "better" iterating structure for the lines and the strips and crossings. I think it is better in every way, except performance. The new file is more than 200 lines smaller than the old one. The only members of Renderer are now the AA variables and the position variables (sx*, sy*, x*, y*). What I've done is I added an EdgeList class, which encapsulates all the edge related variables in the old Renderer. At first, I had an Edge class in addition to the EdgeList class, and while this was much nicer, it turned out to be too expensive (see last paragraph). I've also added a ScanLineIterator, so instead of _endRendering iterating through strips, and then calling renderStrip() which iterates through the scanlines in that strip, and then through the crossings in that scanline, what happens now is that _endRendering uses the Iterator to iterate through each scanline, get get its crossings and iterate through them to accumulate the alpha. By the way, a ScanLine is a type defined by an interface which exports methods for getting the y coord of the line, the number of crossings in it, the ith crossing, and a method for sorting its crossings. The class that implements ScanLine is ScanLineIterator itself. I made a ScanLine class, but I was afraid performance would suffer because of all the object creations (this turned out not to be an issue, after I switched to the current way, and remeasured things). I did not switch back because this is only slightly worse. As for performance: I wrote a simple program that tries to draw a dashed path that consists of about 160 dashed lines of width 1 and length 3, going from the centre of the frame to some point. On my machine, this takes about 4.9 seconds in openjdk6, and 26 seconds using the attached file. Back when I was using the Edge class it took about 39 seconds. Everything without hundres of thousands of edges is not much slower I have not changed any of the algorithms. ScanLineIterator still goes through strips of the same size and computes crossings in every strip using the same method as before, so I don't know why it's so slow. It can't be because of anything happening in _endRendering, because there are only about 9000 scanlines and for each of them I've just added a few calls to one line getters (which used to be direct accesses into arrays). Thanks, Denis. - "Jim Graham" wrote: Denis Lila wrote: Hello Jim. Thank you very much for taking the time to read through this. 169 - if origLen reaches the end of the dash exactly (the "==" case) You're right, I should. I can't just replace<= with == though, because the results will be the same: in the equal case origLen will become 0, and on the next iteration, the (origLen< dash[idx]-phase) will be true, and we will do a goTo(x1,y1), which is what we just did in the previous iteration (unless dash[idx] is 0, in which case the results will be even worse). The best solution to this is to just do a nested check for the == case. Ah, right - because there is no "break" when origLen becomes zero.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello Jim. This one performs almost identically to what is already there in openjdk6 and 7, since it's exactly what I sent for review last week, but with all the changes you suggested implemented. I would actually like to ask you to not look at either one of them. First of all, there is an ArrayIndexOutOfBoundsException possible in emitRow. And secondly, even if there wasn't, last night I implemented your algorithm from ShapeSpanIterator.c to iterate through the edges. I have yet to debug it, but it makes everything much, much simpler, and it should make it far faster, so we get the best of both worlds. Thanks, Denis. - "Jim Graham" wrote: > Hi Denis, > > I'll try to get through both versions and see if I can find anything > that was hurting performance with your EdgeLists. I'm guessing that > this version was created because of the performance issues you found > with the EdgeList version? Does this perform more closely to the > existing code than the EdgeList version? > > ...jim > > Denis Lila wrote: > > Hello again. > > > > This attachmet is a "can of worms" implementation without all the > fancy (and slow) > > iteration. It also includes all of the other suggestions you sent in > your first > > review of Dasher and Renderer last week (most importantly, the > firstOrientation > > issue, horizontal lines filtering, and adding prefixes to variable > names to make > > it clear whether they refer to pixels, or subpixels). > > > > Regards, > > Denis. > > > > - "Denis Lila" wrote: > > > >> Hello Jim. > >> > >> I implemented your "can of worms" idea. It works, and it got rid > of > >> the biasing. > >> I wasn't able to send a webrev, but there are many changes and a > side > >> by side > >> comparison would probably be useless, so I just attached the file. > I > >> hope this is > >> ok. > >> > >> I also implemented a "better" iterating structure for the lines > and > >> the > >> strips and crossings. I think it is better in every way, except > >> performance. > >> The new file is more than 200 lines smaller than the old one. The > only > >> members of Renderer are now the AA variables and the position > >> variables > >> (sx*, sy*, x*, y*). > >> What I've done is I added an EdgeList class, which encapsulates > all > >> the edge > >> related variables in the old Renderer. At first, I had an Edge > class > >> in addition > >> to the EdgeList class, and while this was much nicer, it turned out > to > >> be too > >> expensive (see last paragraph). > >> I've also added a ScanLineIterator, so instead of _endRendering > >> iterating > >> through strips, and then calling renderStrip() which iterates > through > >> the > >> scanlines in that strip, and then through the crossings in that > >> scanline, > >> what happens now is that _endRendering uses the Iterator > to > >> iterate through each scanline, get get its crossings and iterate > >> through them > >> to accumulate the alpha. By the way, a ScanLine is a type defined > by > >> an > >> interface which exports methods for getting the y coord of the > line, > >> the > >> number of crossings in it, the ith crossing, and a method for > sorting > >> its crossings. > >> The class that implements ScanLine is ScanLineIterator itself. I > made > >> a > >> ScanLine class, but I was afraid performance would suffer because > of > >> all the > >> object creations (this turned out not to be an issue, after I > switched > >> to the > >> current way, and remeasured things). I did not switch back because > >> this is > >> only slightly worse. > >> > >> As for performance: I wrote a simple program that tries to draw a > >> dashed path > >> that consists of about 160 dashed lines of width 1 and length > 3, > >> going > >> from the centre of the frame to some point. On my machine, this > takes > >> about 4.9 > >> seconds in openjdk6, and 26 seconds using the attached file. Back > when > >> I was using > >> the Edge class it took about 39 seconds. Everything without hundres > of > >> thousands of > >> edges is not much slower > >> I have not changed any of the algorithms. ScanLineIterator still > goes > >> through > >> strips of the same size and computes crossings in every strip > using > >> the same > >> method as before, so I don't know why it's so slow. It can't be > >> because of anything > >> happening in _endRendering, because there are only about 9000 > >> scanlines and for each > >> of them I've just added a few calls to one line getters (which used > to > >> be direct > >> accesses into arrays). > >> > >> Thanks, > >> Denis. > >> > >> - "Jim Graham" wrote: > >> > >>> Denis Lila wrote: > Hello Jim. > > Thank you very much for taking the time to read through this. > > > 169 - if origLen reaches the end of the dash exactly (the "==" > >>> case) > You're right, I should. I can't just replace <= with == > >> though, > because the results will be the same: in the equal case origLen >
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, I'll try to get through both versions and see if I can find anything that was hurting performance with your EdgeLists. I'm guessing that this version was created because of the performance issues you found with the EdgeList version? Does this perform more closely to the existing code than the EdgeList version? ...jim Denis Lila wrote: Hello again. This attachmet is a "can of worms" implementation without all the fancy (and slow) iteration. It also includes all of the other suggestions you sent in your first review of Dasher and Renderer last week (most importantly, the firstOrientation issue, horizontal lines filtering, and adding prefixes to variable names to make it clear whether they refer to pixels, or subpixels). Regards, Denis. - "Denis Lila" wrote: Hello Jim. I implemented your "can of worms" idea. It works, and it got rid of the biasing. I wasn't able to send a webrev, but there are many changes and a side by side comparison would probably be useless, so I just attached the file. I hope this is ok. I also implemented a "better" iterating structure for the lines and the strips and crossings. I think it is better in every way, except performance. The new file is more than 200 lines smaller than the old one. The only members of Renderer are now the AA variables and the position variables (sx*, sy*, x*, y*). What I've done is I added an EdgeList class, which encapsulates all the edge related variables in the old Renderer. At first, I had an Edge class in addition to the EdgeList class, and while this was much nicer, it turned out to be too expensive (see last paragraph). I've also added a ScanLineIterator, so instead of _endRendering iterating through strips, and then calling renderStrip() which iterates through the scanlines in that strip, and then through the crossings in that scanline, what happens now is that _endRendering uses the Iterator to iterate through each scanline, get get its crossings and iterate through them to accumulate the alpha. By the way, a ScanLine is a type defined by an interface which exports methods for getting the y coord of the line, the number of crossings in it, the ith crossing, and a method for sorting its crossings. The class that implements ScanLine is ScanLineIterator itself. I made a ScanLine class, but I was afraid performance would suffer because of all the object creations (this turned out not to be an issue, after I switched to the current way, and remeasured things). I did not switch back because this is only slightly worse. As for performance: I wrote a simple program that tries to draw a dashed path that consists of about 160 dashed lines of width 1 and length 3, going from the centre of the frame to some point. On my machine, this takes about 4.9 seconds in openjdk6, and 26 seconds using the attached file. Back when I was using the Edge class it took about 39 seconds. Everything without hundres of thousands of edges is not much slower I have not changed any of the algorithms. ScanLineIterator still goes through strips of the same size and computes crossings in every strip using the same method as before, so I don't know why it's so slow. It can't be because of anything happening in _endRendering, because there are only about 9000 scanlines and for each of them I've just added a few calls to one line getters (which used to be direct accesses into arrays). Thanks, Denis. - "Jim Graham" wrote: Denis Lila wrote: Hello Jim. Thank you very much for taking the time to read through this. 169 - if origLen reaches the end of the dash exactly (the "==" case) You're right, I should. I can't just replace <= with == though, because the results will be the same: in the equal case origLen will become 0, and on the next iteration, the (origLen < dash[idx]-phase) will be true, and we will do a goTo(x1,y1), which is what we just did in the previous iteration (unless dash[idx] is 0, in which case the results will be even worse). The best solution to this is to just do a nested check for the == case. Ah, right - because there is no "break" when origLen becomes zero. Sounds like you're on it. 171 - Aren't x0,y0 stored as subpix values? You would then be comparing a subpix value to a non-subpix value. Perhaps if the subpix calls are moved to the top of the function I think this should work OK? That's true, they are. This is very puzzling. If a horizontal line is added, when the crossings for it are being computed, dxBydy should be NaN, and wouldn't an error be thrown when we try to cast to an int in the call to addCrossing? I'm not sure - I didn't trace it through very far - I just noted that the values were likely in different "resolutions". 194,197 - Shouldn't these be constants, or based on the SUB_POS_XY? I suppose I should make a biasing constant. I don't think they should be based on SUB_POS_XY though, because the biasing is done to subpixel coor
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello again. This attachmet is a "can of worms" implementation without all the fancy (and slow) iteration. It also includes all of the other suggestions you sent in your first review of Dasher and Renderer last week (most importantly, the firstOrientation issue, horizontal lines filtering, and adding prefixes to variable names to make it clear whether they refer to pixels, or subpixels). Regards, Denis. - "Denis Lila" wrote: > Hello Jim. > > I implemented your "can of worms" idea. It works, and it got rid of > the biasing. > I wasn't able to send a webrev, but there are many changes and a side > by side > comparison would probably be useless, so I just attached the file. I > hope this is > ok. > > I also implemented a "better" iterating structure for the lines and > the > strips and crossings. I think it is better in every way, except > performance. > The new file is more than 200 lines smaller than the old one. The only > members of Renderer are now the AA variables and the position > variables > (sx*, sy*, x*, y*). > What I've done is I added an EdgeList class, which encapsulates all > the edge > related variables in the old Renderer. At first, I had an Edge class > in addition > to the EdgeList class, and while this was much nicer, it turned out to > be too > expensive (see last paragraph). > I've also added a ScanLineIterator, so instead of _endRendering > iterating > through strips, and then calling renderStrip() which iterates through > the > scanlines in that strip, and then through the crossings in that > scanline, > what happens now is that _endRendering uses the Iterator to > iterate through each scanline, get get its crossings and iterate > through them > to accumulate the alpha. By the way, a ScanLine is a type defined by > an > interface which exports methods for getting the y coord of the line, > the > number of crossings in it, the ith crossing, and a method for sorting > its crossings. > The class that implements ScanLine is ScanLineIterator itself. I made > a > ScanLine class, but I was afraid performance would suffer because of > all the > object creations (this turned out not to be an issue, after I switched > to the > current way, and remeasured things). I did not switch back because > this is > only slightly worse. > > As for performance: I wrote a simple program that tries to draw a > dashed path > that consists of about 160 dashed lines of width 1 and length 3, > going > from the centre of the frame to some point. On my machine, this takes > about 4.9 > seconds in openjdk6, and 26 seconds using the attached file. Back when > I was using > the Edge class it took about 39 seconds. Everything without hundres of > thousands of > edges is not much slower > I have not changed any of the algorithms. ScanLineIterator still goes > through > strips of the same size and computes crossings in every strip using > the same > method as before, so I don't know why it's so slow. It can't be > because of anything > happening in _endRendering, because there are only about 9000 > scanlines and for each > of them I've just added a few calls to one line getters (which used to > be direct > accesses into arrays). > > Thanks, > Denis. > > - "Jim Graham" wrote: > > > Denis Lila wrote: > > > Hello Jim. > > > > > > Thank you very much for taking the time to read through this. > > > > > >> 169 - if origLen reaches the end of the dash exactly (the "==" > > case) > > > > > > You're right, I should. I can't just replace <= with == > though, > > > because the results will be the same: in the equal case origLen > > will > > > become 0, and on the next iteration, the (origLen < > dash[idx]-phase) > > will > > > be true, and we will do a goTo(x1,y1), which is what we just did > in > > the > > > previous iteration (unless dash[idx] is 0, in which case the > results > > > > > will be even worse). The best solution to this is to just do a > > nested > > > check for the == case. > > > > Ah, right - because there is no "break" when origLen becomes zero. > > Sounds like you're on it. > > > > >> 171 - Aren't x0,y0 stored as subpix values? You would then be > > comparing > > >> a subpix value to a non-subpix value. Perhaps if the subpix > calls > > are > > >> moved to the top of the function I think this should work OK? > > > > > > That's true, they are. This is very puzzling. If a horizontal > > line is > > > added, when the crossings for it are being computed, dxBydy should > > be NaN, and > > > wouldn't an error be thrown when we try to cast to an int in the > > call to addCrossing? > > > > I'm not sure - I didn't trace it through very far - I just noted > that > > > > the values were likely in different "resolutions". > > > > >> 194,197 - Shouldn't these be constants, or based on the > > SUB_POS_XY? > > > > > > I suppose I should make a biasing constant. I don't think they > > should be based > > > on SUB_POS_XY though, because the biasing is done to subpixel > > coordinates so > > >
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello Jim. I implemented your "can of worms" idea. It works, and it got rid of the biasing. I wasn't able to send a webrev, but there are many changes and a side by side comparison would probably be useless, so I just attached the file. I hope this is ok. I also implemented a "better" iterating structure for the lines and the strips and crossings. I think it is better in every way, except performance. The new file is more than 200 lines smaller than the old one. The only members of Renderer are now the AA variables and the position variables (sx*, sy*, x*, y*). What I've done is I added an EdgeList class, which encapsulates all the edge related variables in the old Renderer. At first, I had an Edge class in addition to the EdgeList class, and while this was much nicer, it turned out to be too expensive (see last paragraph). I've also added a ScanLineIterator, so instead of _endRendering iterating through strips, and then calling renderStrip() which iterates through the scanlines in that strip, and then through the crossings in that scanline, what happens now is that _endRendering uses the Iterator to iterate through each scanline, get get its crossings and iterate through them to accumulate the alpha. By the way, a ScanLine is a type defined by an interface which exports methods for getting the y coord of the line, the number of crossings in it, the ith crossing, and a method for sorting its crossings. The class that implements ScanLine is ScanLineIterator itself. I made a ScanLine class, but I was afraid performance would suffer because of all the object creations (this turned out not to be an issue, after I switched to the current way, and remeasured things). I did not switch back because this is only slightly worse. As for performance: I wrote a simple program that tries to draw a dashed path that consists of about 160 dashed lines of width 1 and length 3, going from the centre of the frame to some point. On my machine, this takes about 4.9 seconds in openjdk6, and 26 seconds using the attached file. Back when I was using the Edge class it took about 39 seconds. Everything without hundres of thousands of edges is not much slower I have not changed any of the algorithms. ScanLineIterator still goes through strips of the same size and computes crossings in every strip using the same method as before, so I don't know why it's so slow. It can't be because of anything happening in _endRendering, because there are only about 9000 scanlines and for each of them I've just added a few calls to one line getters (which used to be direct accesses into arrays). Thanks, Denis. - "Jim Graham" wrote: > Denis Lila wrote: > > Hello Jim. > > > > Thank you very much for taking the time to read through this. > > > >> 169 - if origLen reaches the end of the dash exactly (the "==" > case) > > > > You're right, I should. I can't just replace <= with == though, > > because the results will be the same: in the equal case origLen > will > > become 0, and on the next iteration, the (origLen < dash[idx]-phase) > will > > be true, and we will do a goTo(x1,y1), which is what we just did in > the > > previous iteration (unless dash[idx] is 0, in which case the results > > > will be even worse). The best solution to this is to just do a > nested > > check for the == case. > > Ah, right - because there is no "break" when origLen becomes zero. > Sounds like you're on it. > > >> 171 - Aren't x0,y0 stored as subpix values? You would then be > comparing > >> a subpix value to a non-subpix value. Perhaps if the subpix calls > are > >> moved to the top of the function I think this should work OK? > > > > That's true, they are. This is very puzzling. If a horizontal > line is > > added, when the crossings for it are being computed, dxBydy should > be NaN, and > > wouldn't an error be thrown when we try to cast to an int in the > call to addCrossing? > > I'm not sure - I didn't trace it through very far - I just noted that > > the values were likely in different "resolutions". > > >> 194,197 - Shouldn't these be constants, or based on the > SUB_POS_XY? > > > > I suppose I should make a biasing constant. I don't think they > should be based > > on SUB_POS_XY though, because the biasing is done to subpixel > coordinates so > > there is no danger that if our supersampling is fine enough the > biasing will > > make the coordinates jump over some scan line. > > I'm guessing you punted on my "can of worms" suggestion then. ;-) > > >> 216 - if you save the sx0,sy0 before subpix'ing them then you don't > have > >> to "unsubpix" them here. (Though you still need the subpix sy0 for > line > >> 209 - or you could just call subpix on it for that line and at > least > >> you'd save 1 call to sub/unsub). > > > > Ok. I'll just save subpixed and unsubpixed versions of sx0, sy0. > That should > > eliminate all sx0,sy0 related calls to tosubpix and topix except in > moveTo. > > and lineT
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
> >> 721 - Arrays.sort() > > > > I thought about using this, but I did some measurements, and it > turns out that > > Arrays.sort() is a bit slower if the portion of the array being > sorted has fewer > > than about 70 elements. > > I wonder what the typical number of elements is. Is this sorting > crossings per line? Then simple primitives like circles should only > have 2 per line, right? Is it worth testing for really small numbers of > elements (much lower than 70) and doing a manual sort? Or am I > misunderstanding what is being sorted there? That's correct. For every scan line, we're sorting the crossings on that scan line, so filling a circle would have 2 per line (drawing it would have 4 per line (for most lines, anyway)). I'm not sure if it's worth it. On one hand, the performance benefits would be minimal (because for small numbers of elements, just about anything will be fast and even if insertionSort is 200% faster than quicksort it won't make much difference). On the other hand, checking for, say crossingIndices[i] - start < 50 doesn't cost us much, and I already implemented and tested insertionSort. I think we should just use Arrays.sort all the time, just to keep the file as small as possible. > If you look for something like the native code for > sun/java2d/pipe/ShapeSpanIterator.c you will see the way I typically > like to do edge setup and enumeration. > > That code uses the "half open interval" approach for both X and Y > intervals. > > I then sort the edge list by "leading Y" and then move through the > edges > using the following manner (kind of like an inch worm eating the > edges, > maintaining a pointer to the beginning and end of an "active list" of > > edges that are in play at any given time by virtue of having their Y > range intersect the current sampling Y). Note that I use an array of > > edges and then a parallel array of pointers into the edges so that I > can > sort just the pointers and avoid moving tons of edge data around. > Also, > later when I eliminate edges from the active list I just move their > pointers into and out of view rather than having to copy the edge > data. > It is harder to do an array of pointers in Java, though - perhaps an > > array of indices? Here is some basic pseudocode: > > start with lo and hi pointing at index 0 in edge list. > until edge list is exhausted { > process edges between lo and hi (empty on first pass) > scan from hi to lo and toss any edges that are exhausted > (compacting remaining pointers towards hi) > keep incrementing hi to accept any new edges coming into play > process edges between lo and hi to increment to the next Y > (note that new edges from previous step may not > need any processing in this stage if their starting > Y equals the next Y to be sampled) > } > > Gradually lo and hi make their way through the list. The edges above > hi > are always sorted in order of when they may come into play as we move > > downward in Y. The edges between lo and hi are also usually kept > sorted > by their current X so that the stage that processes them into spans > can > just run through them. The edges below lo are usually random garbage > > because no care was taken during the "pruning" step to worry about > what > happens to the pointer table down there as lo is incremented (the > algorithm only ever looks up the array of pointers). > > I hope that helps... That does help. Thanks a lot, Denis.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Denis Lila wrote: Hello Jim. Thank you very much for taking the time to read through this. 169 - if origLen reaches the end of the dash exactly (the "==" case) You're right, I should. I can't just replace <= with == though, because the results will be the same: in the equal case origLen will become 0, and on the next iteration, the (origLen < dash[idx]-phase) will be true, and we will do a goTo(x1,y1), which is what we just did in the previous iteration (unless dash[idx] is 0, in which case the results will be even worse). The best solution to this is to just do a nested check for the == case. Ah, right - because there is no "break" when origLen becomes zero. Sounds like you're on it. 171 - Aren't x0,y0 stored as subpix values? You would then be comparing a subpix value to a non-subpix value. Perhaps if the subpix calls are moved to the top of the function I think this should work OK? That's true, they are. This is very puzzling. If a horizontal line is added, when the crossings for it are being computed, dxBydy should be NaN, and wouldn't an error be thrown when we try to cast to an int in the call to addCrossing? I'm not sure - I didn't trace it through very far - I just noted that the values were likely in different "resolutions". 194,197 - Shouldn't these be constants, or based on the SUB_POS_XY? I suppose I should make a biasing constant. I don't think they should be based on SUB_POS_XY though, because the biasing is done to subpixel coordinates so there is no danger that if our supersampling is fine enough the biasing will make the coordinates jump over some scan line. I'm guessing you punted on my "can of worms" suggestion then. ;-) 216 - if you save the sx0,sy0 before subpix'ing them then you don't have to "unsubpix" them here. (Though you still need the subpix sy0 for line 209 - or you could just call subpix on it for that line and at least you'd save 1 call to sub/unsub). Ok. I'll just save subpixed and unsubpixed versions of sx0, sy0. That should eliminate all sx0,sy0 related calls to tosubpix and topix except in moveTo. and lineTo. You may only need 3 of those values, though, if I remember my code reading well enough. 256,264 - casting to int is problematic. It truncates towards 0 which means negatives are ceil'd and positives are floor'd. It would be best to use floor here instead. On the other hand, if negative numbers are always "off the left side of the drawable" then this is moot. That's why I left it at int casting. Do you still think I should change it to floor? If you mean floor, I think it best to use floor. Unless you can prove that negatives aren't really an issue and that the strange truncation on them won't be numerically a problem - but I don't think it is worth it for this code. Speaking of which, is there a good way to edit and build openJDK from eclipse? Then this sort of embarrassing error could be avoided (including the printStats() call). I don't use Eclipse, sorry. :-( As for Arrays.newSize()... I can't find it here: http://download.oracle.com/docs/cd/E17409_01/javase/6/docs/api/java/util/Arrays.html Is this a new function added in 7? Sorry, make that Arrays.copyOf(..., newSize). I tried to type the name from memory and got it wrong. 721 - Arrays.sort() I thought about using this, but I did some measurements, and it turns out that Arrays.sort() is a bit slower if the portion of the array being sorted has fewer than about 70 elements. I wonder what the typical number of elements is. Is this sorting crossings per line? Then simple primitives like circles should only have 2 per line, right? Is it worth testing for really small numbers of elements (much lower than 70) and doing a manual sort? Or am I misunderstanding what is being sorted there? How comfortable do you feel with that conversion? I'll try to do it and include it in a new patch along with, hopefully, a better way to iterate through strips, and especially crossings. Right now all the iteration state is saved in global variables. This is... not good. I spent far too much time last week on bugs caused by this sort of thing. Ideally, any members that can't be put at the top of a class (like our edge and crossing data) should be put in classes of their own. That sounds good, but also consider doing it in separate stages to reduce churn in the code reviewing (and you then have revs to go back and test which stage caused a probem if we find a bug later): - first get all on floats - then change strip management - then change to open coordinate intervals - (or vice versa) Do you have any ideas about how to iterate through edges in a strip without going through every edge? I was thinking of maybe using some sort of tree to split the drawing surface, but I haven't given it much thought. If you look for something like the native code for sun/java2d/pipe/ShapeSpanIterator.c you will see the
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello Jim. Thank you very much for taking the time to read through this. > 91 - is there a reason to copy the dash array? Theoretically if this > object were used in an arbitrary fashion it is good sense to protect > against data changing out from under it, but in practice we only ever > feed it the array that came from a BasicStroke object which was already > protected from arbitrary changes and is copied every time we fetch it so > this is one more unfortunately extra fetch that isn't needed in > practice. (Ideally we'd instrument an extractor for BasicStroke so we > could get the raw array without the defensive copy in > getDashArray()...) You're right, I suppose there isn't. It's just the way it was being done before, and it didn't occur to me to change it. I will, in the next version of this patch. > 169 - if origLen reaches the end of the dash exactly (the "==" case) > then shouldn't you flip the dashOn variable? That would happen if you > drop through to the following code (i.e. change the test to just "<"), > but it would require a bit more calculation. If you leave the phase > equal to the dash size (which is what happens now in the "==" case) then > the next time through the if will fail and you will end up with a second > goTo to the same coordinates that the previous dash ended on before > flipping the dashOn variable and circling back - which is a waste of a > bunch of calculations. Or was there some bookkeeping issue that arose > when the "==" case dropped through? You're right, I should. I can't just replace <= with == though, because the results will be the same: in the equal case origLen will become 0, and on the next iteration, the (origLen < dash[idx]-phase) will be true, and we will do a goTo(x1,y1), which is what we just did in the previous iteration (unless dash[idx] is 0, in which case the results will be even worse). The best solution to this is to just do a nested check for the == case. > 248 - is the normalize parameter used yet? Not a complaint - I just > didn't see it getting used and wanted to make sure I didn't miss > something. No, no yet. That's next week's work :) > 257-261 - both Stroker and Dasher grab the t4 values into local > variables. Hotspot might cover up this sin if the T4 stays in the young > generation, but for a long path the T4 may move into an older generation > and take more to collect. Maybe it would be worth just passing in the 4 > values to the Stroker and Dasher constructors rather than encapsulating > them in a T4 that is immediately stripped and ignored? Definitely. I never liked using the Transform4 class, and I almost removed it in this patch (but didn't because I didn't want to change that much). This is all the reason I need to remove it. Nothing ever uses Transform4 anyway, except to just unpack the matrix entries, so all we get from it is slightly reduced parameter lists. > 367 - you left a stats printing call in here. Oops... sorry about that. > I note some confusion below over comparing subpixel values to pixel > values. Perhaps a convention where all subpixel values would be in > variables with "sub" in their name or something like that would make it > more maintainable (and correct in the short term). That's a good idea. I'll do that. > 47,48 - any reason to move these fields around? (Not a complaint - just > wondering if you had a reason.) No, no reason. It happened accidentally - I made many changes to this file before settling on this, and at one point these variables weren't even there (replaced by Math.floor and Math.ceil calls that I thought did equivalent things to the various bit operations these are used for. After running into some bugs, I reintroduced them to keep the changes as small as possible and eliminate these as bug sources). > 129 - note this could overflow, but I imagine that the bounds came from > a device clip which means there would have to be some drawable created > that was larger than MAX_INT/SUB_POS_XY. If that is true then it > probably isn't worth worrying about, but a comment might help point that out. That's true. In fact, there are a few other places where ints could overflow if the clip passed in to Renderer is large enough. I'll try to comment on them all. > 157 - Personally I'd call close before I started working on the new > data, but that's just my coding layout style. Also, does close() always > do the "right thing" if no path has been created yet? For one thing it > looks like it always results in a call to "lineTo" no matter what which > I'm guessing "happens to" fall into the short-circuit for horizontal > lines in lineTo, but I haven't verified it. In fact I think I see a > problem in that firstOrientation is not set to 0 and so if you have 2 > moveto's in a row then the second one will call close() and find that > the orientations don't match and increment flips. That isn't a critical > problem si
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, This is awesome! Thanks for doing this. I agree with all of your comments below. Here are some thoughts on the new code: Dasher.java: 91 - is there a reason to copy the dash array? Theoretically if this object were used in an arbitrary fashion it is good sense to protect against data changing out from under it, but in practice we only ever feed it the array that came from a BasicStroke object which was already protected from arbitrary changes and is copied every time we fetch it so this is one more unfortunately extra fetch that isn't needed in practice. (Ideally we'd instrument an extractor for BasicStroke so we could get the raw array without the defensive copy in getDashArray()...) 169 - if origLen reaches the end of the dash exactly (the "==" case) then shouldn't you flip the dashOn variable? That would happen if you drop through to the following code (i.e. change the test to just "<"), but it would require a bit more calculation. If you leave the phase equal to the dash size (which is what happens now in the "==" case) then the next time through the if will fail and you will end up with a second goTo to the same coordinates that the previous dash ended on before flipping the dashOn variable and circling back - which is a waste of a bunch of calculations. Or was there some bookkeeping issue that arose when the "==" case dropped through? PiscesRenderingEngine.java 248 - is the normalize parameter used yet? Not a complaint - I just didn't see it getting used and wanted to make sure I didn't miss something. 257-261 - both Stroker and Dasher grab the t4 values into local variables. Hotspot might cover up this sin if the T4 stays in the young generation, but for a long path the T4 may move into an older generation and take more to collect. Maybe it would be worth just passing in the 4 values to the Stroker and Dasher constructors rather than encapsulating them in a T4 that is immediately stripped and ignored? 367 - you left a stats printing call in here. Renderer.java: I note some confusion below over comparing subpixel values to pixel values. Perhaps a convention where all subpixel values would be in variables with "sub" in their name or something like that would make it more maintainable (and correct in the short term). 47,48 - any reason to move these fields around? (Not a complaint - just wondering if you had a reason.) 129 - note this could overflow, but I imagine that the bounds came from a device clip which means there would have to be some drawable created that was larger than MAX_INT/SUB_POS_XY. If that is true then it probably isn't worth worrying about, but a comment might help point that out. 157 - Personally I'd call close before I started working on the new data, but that's just my coding layout style. Also, does close() always do the "right thing" if no path has been created yet? For one thing it looks like it always results in a call to "lineTo" no matter what which I'm guessing "happens to" fall into the short-circuit for horizontal lines in lineTo, but I haven't verified it. In fact I think I see a problem in that firstOrientation is not set to 0 and so if you have 2 moveto's in a row then the second one will call close() and find that the orientations don't match and increment flips. That isn't a critical problem since flips is only used to allocate memory, but it would allocate more than needed in that case. (not to mention emitting unnecessary linetos) Suggestion - set firstOrientation to 0 in moveto and test for firstOrientation==0 in close() so that close() doesn't add any geometry to something that has no deflections yet. 171 - Aren't x0,y0 stored as subpix values? You would then be comparing a subpix value to a non-subpix value. Perhaps if the subpix calls are moved to the top of the function I think this should work OK? 194,197 - Shouldn't these be constants, or based on the SUB_POS_XY? Warning: potential can of worms BTW, I never traced this through to see why it was needed, but I've seen a lot of renderers get afraid of vertices on pixel boundaries for similar reasons, but most of them are wrong because they miss the fact that the last coordinate on a line should be ignored by the scanline counts because of the rule of "any pixel on a horizontal edge should be outside the path if the area right below it is outside". Basically, we fill from x0 up to but not including x1 and also from y0 up to but not including y1. Thus, only y0 would trigger a scanline increment and y1 would never do so. If the following segment immediately heads up then both of those segments would "end" on the same scanline and be ignored. If the following segment continues down then the first does not trigger a scanline crossing, but the second does. If we can make sure the bookkeeping is done that way in the rest of the Renderer then we can get rid of this fudge factor entirely.
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello again. I know I promised this last week, and I'm sorry it's late, but some nasty bugs popped up. The webrev is at: http://icedtea.classpath.org/~dlila/webrevs/floatingPoint/webrev/ I took this opportunity to make some changes that aren't related to floating point conversion, since this effort wasn't really directed at solving any one particular bug, but towards general refactoring and improving of Pisces (although it does solve a certain bug). 1. I removed some write-only variables in Renderer and Stroker. 2. I removed Dasher and Stroker's ability for the same object to be used with more than one output, transform, width, etc. Jim Graham said we should consider removing the no argument constructor for Dasher in a previous e-mail (which is equivalent to doing what I've done since you can't have this functionality without a no argument constructor and methods like setParameters and setOutput). I thought this was a good idea because this functionality was never being used, it clutters things up (among other things, it necessitates making the clients call setOutput and setParameters before the object can be used. This sort of thing is not a good idea since methods should be as self-contained as possible), and even if it is used, all it does is save us the creation of some objects. Since a huge number of these objects would never be needed and since they are not very expensive to create (Stroker is the biggest, and it has about 38 members), this is a premature optimization. 3. (2) allowed me to declare a whole bunch of members final (things like output, lineWidth, transformation matrix entries and anything that can't change). 4. I changed the crossing sorting algorithm in Renderer from bubble sort to insertion sort. Not a huge difference, but mine is shorter, it should perform slightly better, and I think it the algorithm is easier to understand. 5. I inserted comments on things which used to confuse me. Please check these - when reading code, there is nothing worse than a wrong explanation in a comment. 6. I removed the if(false &&... block in Renderer. I tried to make it work some time ago, but it was complicated and more trouble than it was worth - it would only be used when filling a rectangle, and not even a rectangle that was part of some path. The rectangle had to be the only thing being rendered. This is fast enough without being treated as a special case. I think that's it... As for testing: I tested this fairly thoroughly. I used dashed strokes, various affine transformations, lines longer than 2^16, and complicated GeneralPaths. Everything looks good. I also did some performance measurements. Everything is faster than it used to be, although AA rendering is still much slower than closed source java. One of the problems here is that when rendering large objects, the strips in Renderer:_endRendering will be small, and _endRenderer's complexity, not taking into account methods it calls, is O(numStrips*numEdges), because for every strip it has to go through what's left of the edge list. A better way to iterate through edges that are in a strip is needed. NOTE: I did not make changes (2) and (3) to Renderer.java because I don't have time today, and I wanted to put out something fairly polished as soon as possible. Also, I think Renderer needs quite a bit more refactoring than just that. In particular the way it stores and iterates through edges, scalines, and crossings needs to be improved. Right now it is too error-prone, and many variables have a far larger scope than they should. I would very much appreciate it if anyone could look over my changes, or comment on the last two paragraphs above. Thank you, Denis. - "Denis Lila" wrote: > Hello. > > And, I just finished it. At least it compiled successfully. I'm sure > there > are a few runtime bugs. I'll try to have a webrev out by today. > > Regards, > Denis. > > - "Jim Graham" wrote: > > > Hi Denis, > > > > float operations are pretty fast and they are usually done in a > > separate > > part of the processor so the compiler can schedule a lot of > > bookkeeping > > instructions to run in parallel with waiting for the results of the > FP > > > > instruction. In the end, they often end up being "free" if there > are > > > > enough bookkeeping instructions (branches, fetching data, etc.) in > > amongst the data. > > > > Given how many alternate instructions you are pointing out for the > > fixed > > point code it would be very likely that this would be a "win". > > > > The main reason that Pisces is implemented heavily in fixed point is > > that it was originally written for the mobile platform where there > are > > > > no FP instructions. We don't have to worry about that on the > desktop > > > > (any more). > > > > I strongly support you converting things to fp if you want to do > > it... > > > > ...jim > > > > On 7/12/2010 8:05 AM, Denis Lila wrote: >
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello. And, I just finished it. At least it compiled successfully. I'm sure there are a few runtime bugs. I'll try to have a webrev out by today. Regards, Denis. - "Jim Graham" wrote: > Hi Denis, > > float operations are pretty fast and they are usually done in a > separate > part of the processor so the compiler can schedule a lot of > bookkeeping > instructions to run in parallel with waiting for the results of the FP > > instruction. In the end, they often end up being "free" if there are > > enough bookkeeping instructions (branches, fetching data, etc.) in > amongst the data. > > Given how many alternate instructions you are pointing out for the > fixed > point code it would be very likely that this would be a "win". > > The main reason that Pisces is implemented heavily in fixed point is > that it was originally written for the mobile platform where there are > > no FP instructions. We don't have to worry about that on the desktop > > (any more). > > I strongly support you converting things to fp if you want to do > it... > > ...jim > > On 7/12/2010 8:05 AM, Denis Lila wrote: > > Hello. > > > > Is it ok if I do this? I already started working on it on Friday. > > I think I should be done by tomorrow. > > > > But yes, I agree that we should convert to floating point. As for > > performance, there's also the fact that right now we're trading > > one double multiplication for 2 casts to long, 1 long > multiplication, > > 1 bit shift of a long, and 1 cast back to an int. I don't know much > > about how these are done in hardware, but it doesn't seem like > they'd > > be faster than the double multiplication. > > > > As for large coordinates, there's a bug report about it (one not > > reported by me :) ) here: > https://bugzilla.redhat.com/show_bug.cgi?id=597227 > > I submitted a matching bug report on bugs.sun.com (ID 6967436), but > I > > can't find it when I search for it. > > > > Denis. > > > > - "Jim Graham" wrote: > > > >> Sigh... > >> > >> Pisces could really stand to upgrade to floats/doubles everywhere, > for > >> > >> several reasons: > >> > >> - FPU ops are likely as fast if not faster on modern hardware due > to > >> parallel execution of FP instructions alongside regular > instructions. > >> > >> - Don't have to worry about getting coordinates larger than 32K (I > >> don't > >> think this is well tested, but I imagine that Pisces will not deal > >> with > >> it very gracefully). > >> > >> - Lots of code is greatly simplified not having to deal with the > >> semantics of how to do fixed point multiplies, divides, and > >> conversions. > >> > >> I meant to do this during the original integration, but I wanted > to > >> get > >> the task done as quickly as possible so that we could have an open > >> source alternative to the closed Ductus code so I left that task > for a > >> > >> later update. But, now that a lot of work is being done on the > code > >> to > >> fix it up, it might be better to do the float conversion now so > that > >> the > >> maintenance is easier and before we end up creating a lot of new > fixed > >> > >> point code. > >> > >> My plate is full right now, but hopefully I can interest someone > else > >> in > >> doing a cleanup cycle? (Donning my Tom Sawyer hat... ;-) > >> > >>...jim
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hi Denis, float operations are pretty fast and they are usually done in a separate part of the processor so the compiler can schedule a lot of bookkeeping instructions to run in parallel with waiting for the results of the FP instruction. In the end, they often end up being "free" if there are enough bookkeeping instructions (branches, fetching data, etc.) in amongst the data. Given how many alternate instructions you are pointing out for the fixed point code it would be very likely that this would be a "win". The main reason that Pisces is implemented heavily in fixed point is that it was originally written for the mobile platform where there are no FP instructions. We don't have to worry about that on the desktop (any more). I strongly support you converting things to fp if you want to do it... ...jim On 7/12/2010 8:05 AM, Denis Lila wrote: Hello. Is it ok if I do this? I already started working on it on Friday. I think I should be done by tomorrow. But yes, I agree that we should convert to floating point. As for performance, there's also the fact that right now we're trading one double multiplication for 2 casts to long, 1 long multiplication, 1 bit shift of a long, and 1 cast back to an int. I don't know much about how these are done in hardware, but it doesn't seem like they'd be faster than the double multiplication. As for large coordinates, there's a bug report about it (one not reported by me :) ) here: https://bugzilla.redhat.com/show_bug.cgi?id=597227 I submitted a matching bug report on bugs.sun.com (ID 6967436), but I can't find it when I search for it. Denis. - "Jim Graham" wrote: Sigh... Pisces could really stand to upgrade to floats/doubles everywhere, for several reasons: - FPU ops are likely as fast if not faster on modern hardware due to parallel execution of FP instructions alongside regular instructions. - Don't have to worry about getting coordinates larger than 32K (I don't think this is well tested, but I imagine that Pisces will not deal with it very gracefully). - Lots of code is greatly simplified not having to deal with the semantics of how to do fixed point multiplies, divides, and conversions. I meant to do this during the original integration, but I wanted to get the task done as quickly as possible so that we could have an open source alternative to the closed Ductus code so I left that task for a later update. But, now that a lot of work is being done on the code to fix it up, it might be better to do the float conversion now so that the maintenance is easier and before we end up creating a lot of new fixed point code. My plate is full right now, but hopefully I can interest someone else in doing a cleanup cycle? (Donning my Tom Sawyer hat... ;-) ...jim
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Hello. Is it ok if I do this? I already started working on it on Friday. I think I should be done by tomorrow. But yes, I agree that we should convert to floating point. As for performance, there's also the fact that right now we're trading one double multiplication for 2 casts to long, 1 long multiplication, 1 bit shift of a long, and 1 cast back to an int. I don't know much about how these are done in hardware, but it doesn't seem like they'd be faster than the double multiplication. As for large coordinates, there's a bug report about it (one not reported by me :) ) here: https://bugzilla.redhat.com/show_bug.cgi?id=597227 I submitted a matching bug report on bugs.sun.com (ID 6967436), but I can't find it when I search for it. Denis. - "Jim Graham" wrote: > Sigh... > > Pisces could really stand to upgrade to floats/doubles everywhere, for > > several reasons: > > - FPU ops are likely as fast if not faster on modern hardware due to > parallel execution of FP instructions alongside regular instructions. > > - Don't have to worry about getting coordinates larger than 32K (I > don't > think this is well tested, but I imagine that Pisces will not deal > with > it very gracefully). > > - Lots of code is greatly simplified not having to deal with the > semantics of how to do fixed point multiplies, divides, and > conversions. > > I meant to do this during the original integration, but I wanted to > get > the task done as quickly as possible so that we could have an open > source alternative to the closed Ductus code so I left that task for a > > later update. But, now that a lot of work is being done on the code > to > fix it up, it might be better to do the float conversion now so that > the > maintenance is easier and before we end up creating a lot of new fixed > > point code. > > My plate is full right now, but hopefully I can interest someone else > in > doing a cleanup cycle? (Donning my Tom Sawyer hat... ;-) > > ...jim
Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code
Sigh... Pisces could really stand to upgrade to floats/doubles everywhere, for several reasons: - FPU ops are likely as fast if not faster on modern hardware due to parallel execution of FP instructions alongside regular instructions. - Don't have to worry about getting coordinates larger than 32K (I don't think this is well tested, but I imagine that Pisces will not deal with it very gracefully). - Lots of code is greatly simplified not having to deal with the semantics of how to do fixed point multiplies, divides, and conversions. I meant to do this during the original integration, but I wanted to get the task done as quickly as possible so that we could have an open source alternative to the closed Ductus code so I left that task for a later update. But, now that a lot of work is being done on the code to fix it up, it might be better to do the float conversion now so that the maintenance is easier and before we end up creating a lot of new fixed point code. My plate is full right now, but hopefully I can interest someone else in doing a cleanup cycle? (Donning my Tom Sawyer hat... ;-) ...jim