Jim, thanks for this long explanation, and I agree with your approach of using semi-open intervals.
I have the feeling that Pisces was not so accurate in implementing it. My comments are inline below.

I have a few questions regarding the renderer's edge handling and the float ceil calls:

> I have a personal philosophy for all rendering engines that I've produced, but I'm not sure the philosophy was adopted for this code. I like to use the same rounding for every operation and have all ranges be specified as semi-open regions - x1 <= x < x2 - for example.
>
> The "insideness" rule is most naturally expressed by semi-open intervals since pixel centers that fall on a vertical boundary are inside if the interior region is to their right (i.e. the x1 <= x half of that equation above) and they are outside if the space to their right is non-interior (i.e. the x < x2 half of that equation). A similar rule is also used vertically.
>
> The rounding operation which computes this would be:
>
>     xc = computed crossing value for the current scan line;
>     ceil(xc - 0.5);

Agreed; however, the Pisces code never performs such rounding. For me, ceil(x - 0.5) is essentially (int) Math.round(x) (they differ only at exact half-integers, where round goes up).

> This maps (for some integer k):
>
>     k+0.4999 to k
>     k+0.5000 to k
>     k+0.5001 to k+1
>
> Which is exactly what you want for the beginning of a region. If the crossing is exactly at the pixel center then that is the furthest right that you can cross the scan line and still include pixel k. If you just miss the center of a pixel to the right, then the first pixel to be included is k+1. That function computes it correctly.
>
> Similarly, if you apply the same formula to the right edge of a region then you compute the "first pixel to not be included" in the region. Consider that if you miss a pixel center on that right edge by falling slightly to the left of it, then that pixel is "outside" and it is the first such pixel to be "outside". If you hit its pixel center exactly, then the insideness rule also says that it is "outside".
> Only when you just miss it by falling slightly to the right of its center would that pixel still be included. The above formula computes that value if you take its answer to be the "first pixel not included in the region".
>
> This lets you perform the same rounding on every value and treat every pair of "inside trigger" to "outside trigger" as a half-open interval. Note that otherwise you'd have to compute every floating point crossing, then run the EvenOdd vs. ZeroWinding rule on them, and finally only do the rounding after you knew which were left edges and which were right (which you don't know until you've computed every crossing on the scan line). Since this rounding rule is symmetric given half-open intervals, you can round everything before you've analyzed the results of the winding rules.
>
> I then extend that philosophy to every set of coordinates I deal with, be it horizontal ranges, vertical ranges, shape rasterized results, or clip results, and the math usually works out in the simplest and most elegant way possible.
>
> For an AA renderer, it isn't clear that the sub-pixel computations should be as sub-sub-pixely accurate, but I'd like to think that the philosophy should be kept down to the lowest layers if we can, especially as this code can be tuned down to a single sub-pixel per real pixel and then it definitely should be as accurate as it can with the insideness rule (even though I doubt it will ever be set up that way).
>
> Another consideration is the vertices on an enclosing polygon. You compute one segment to where it ends at xe1,ye1 and start computing the next segment from where it starts at xs2,ys2, but for adjacent segments in an outline, they start and end on a shared common point so xe1==xs2 and ye1==ys2. If you compute crossings for both segments at that shared point then you have a chance of recording the crossings twice for that tiny chunk of the outline.
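Before going further, let me restate the rounding rule as code to be sure I read it correctly (a toy sketch of my own, not Pisces code):

```java
public class CrossingRounding {

    /**
     * Jim's rule: map a scan-line crossing to the first pixel whose
     * center (at k + 0.5) lies at or to the right of the crossing.
     */
    static int crossingToPixel(double xc) {
        return (int) Math.ceil(xc - 0.5);
    }

    public static void main(String[] args) {
        int k = 7;
        // the mapping quoted above, for some integer k:
        assert crossingToPixel(k + 0.4999) == k;     // just left of the center
        assert crossingToPixel(k + 0.5000) == k;     // exactly on the center
        assert crossingToPixel(k + 0.5001) == k + 1; // just right of the center
        // on a right edge the same value reads as "first pixel NOT included"
        System.out.println("rounding rule verified");
    }
}
```

Applying the same function to left and right crossings is what turns every span into a half-open interval [in, out).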
> You could special case the last point in a segment, but that gets into lots of tricky tests. On the other hand, if all intervals everywhere are half-open then anything you compute for xe1,ye1 would represent "the first line/pixel that segment 1 does not cause to be included" and computing the same values for xs2,ys2 would naturally generate "the first line/pixel that segment 2 causes to be included", and only one of them would induce inclusion on that scan line or pixel. This gets even more valuable when you consider all cases of segment 1 going down to xe1,ye1 and segment 2 going back up from that point, or up/down or up/up or down/down - and also left/left or left/right or right/right or right/left. All possible cases of how one segment arrives at xe1,ye1 and the next segment leaves from that same point just naturally fall out from using symmetric rounding and half-open intervals.
>
> I wanted to get that out to see if we were on the same page or if there was another way of doing the math that you are familiar with that might also make things easier.

Perfect and very clear. I will then check the Pisces rounding, both on pixel coordinates (px, py) and on AA (virtual) sub-pixel coordinates (spx, spy).

> I'm pretty sure that most of the code in the current version of Pisces uses that same philosophy, but I seem to recall that there were a couple of places that Denis was more comfortable with fully closed intervals. I don't remember the details, but be on the watch for it. Also, I don't recall if it used a half-sub-pixel bias anywhere or simply punted on it as not being visible enough to worry about strict "center of sub-pixel" comparisons.

I am perplexed, and I am going to check the Pisces code against the approach you describe.

> Now on to the specific questions:
>
>>     private void addLine(float x1, float y1, float x2, float y2) {
>>         ...
>>         // LBO: why ceil, i.e. upper integer?
>>         final int firstCrossing = Math.max(ceil(y1), boundsMinY); // upper integer
>>         final int lastCrossing  = Math.min(ceil(y2), boundsMaxY); // upper integer
>
> Perhaps lastCrossing is misnamed. I think it represents the end of the interval which is half-open. Does that match the rest of the operations done on it?
>
> If every coordinate has already been biased by the -0.5 then ceil is just the tail end of the rounding equation I gave above.

That's not the case, so this looks buggy: x1, y1 and x2, y2 are directly the point coordinates as float values. FYI, firstCrossing is used to compute the first X intersection for that Y coordinate and to determine the bucketIdx:

    _edges[ptr+CURX] = x1 + (firstCrossing - y1) * slope;
    ...
    final int bucketIdx = firstCrossing - boundsMinY;
    ...
    _edges[ptr+NEXT] = edgeBuckets[bucketIdx];

lastCrossing is used to give the YMax edge coordinate and edgeBucketCounts:

    _edges[ptr+YMAX] = lastCrossing;
    ...
    edgeBucketCounts[lastCrossing - boundsMinY] |= 1;

I think rounding errors can lead to pixel / shape rasterization deformations...?

> If not, then the half-open considerations still apply if you use the same rounding for both the top and the bottom of the segment. If lastCrossing represents the "bottom of the shape" then it should not be included. If the next segment continues on from it, then its y1 will be computed in the first equation and will represent the first scanline that the next segment participates in.
>
>> => firstCrossing / lastCrossing should use lower and upper integer values (rounding issues)?
>
> Do they make sense now given my explanation above? Perhaps they should be renamed to indicate their half-openness?

I am a bit embarrassed to verify the maths performed in ScanLineIterator.next(), which uses the edges and edgeBucketCounts arrays... could you have a look? Apparently, it uses the following for loop, which respects the semi-open interval philosophy:

    for (int i = 0, ecur, j, k; i < count; i++) {
        ...
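If I understand correctly, the half-open [firstCrossing, lastCrossing) range is also what prevents the shared endpoint of two adjacent segments from being counted twice; a toy check (my own names and made-up coordinates, ignoring the boundsMinY/boundsMaxY clamps):

```java
public class HalfOpenCrossings {

    /** Scan lines [first, last) touched by a segment going down from y1 to y2 (y1 < y2). */
    static int[] crossingRange(float y1, float y2) {
        int firstCrossing = (int) Math.ceil(y1); // first scan line included
        int lastCrossing  = (int) Math.ceil(y2); // first scan line NOT included
        return new int[] { firstCrossing, lastCrossing };
    }

    public static void main(String[] args) {
        // two adjacent segments sharing the point y = 3.7:
        int[] seg1 = crossingRange(1.2f, 3.7f); // covers scan lines 2, 3
        int[] seg2 = crossingRange(3.7f, 5.1f); // covers scan lines 4, 5
        // seg1's half-open end equals seg2's start: no scan line counted twice
        assert seg1[1] == seg2[0];
    }
}
```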
However, I am not sure whether the edges "linked list" is traversed correctly...

>>     boolean endRendering() {
>>         // TODO: perform shape clipping to avoid dealing with segments
>>         // out of the bounding box
>>
>>         // Ensure shape edges are within bbox:
>>         if (edgeMinX > edgeMaxX || edgeMaxX < 0f) {
>>             return false; // undefined X bounds or negative Xmax
>>         }
>>         if (edgeMinY > edgeMaxY || edgeMaxY < 0f) {
>>             return false; // undefined Y bounds or negative Ymax
>>         }
>
> I'd use min >= max since if min==max then I think nothing gets generated as a result of all edges having both the in and out crossings on the same coordinate. Also, why not test against the clip bounds instead? The code after that will clip the edgeMinMaxXY values against the boundsMinMax values. If you do this kind of test after that clipping is done (on spminmaxxy) then you can discover if all of the coordinates are outside the clip or the region of interest.

I tried here to perform a few "fast" checks before doing the float to int conversions (costly, because millions of them are performed). I think it can still be improved: edgeMinX > edgeMaxX only ensures that edgeMinX is defined and that both are positive! TODO: improve the boundary checks here.

>> // why use upper integer for edge min values?
>
> Because an edge falling between sample points includes only the sample point "above" it. (Or "un-includes" it if it is a right/bottom edge.)
>
>> => Here is the question: why use ceil instead of floor?
>>     final int eMinX = ceil(edgeMinX); // upper positive int
>>     if (eMinX > boundsMaxX) {
>>         return false; // Xmin > bbox maxX
>>     }
>>
>>     final int eMinY = ceil(edgeMinY); // upper positive int
>>     if (eMinY > boundsMaxY) {
>>         return false; // Ymin > bbox maxY
>>     }
>>
>>     int spminX = Math.max(eMinX, boundsMinX);
>>     int spmaxX = Math.min(ceil(edgeMaxX), boundsMaxX); // upper positive int
>>     int spminY = Math.max(eMinY, boundsMinY);
>>     int spmaxY = Math.min(ceil(edgeMaxY), boundsMaxY); // upper positive int
>>
>>     int pminX = spminX >> SUBPIXEL_LG_POSITIONS_X;
>>     int pmaxX = (spmaxX + SUBPIXEL_MASK_X) >> SUBPIXEL_LG_POSITIONS_X;
>>     int pminY = spminY >> SUBPIXEL_LG_POSITIONS_Y;
>>     int pmaxY = (spmaxY + SUBPIXEL_MASK_Y) >> SUBPIXEL_LG_POSITIONS_Y;
>>
>>     // store BBox to answer ptg.getBBox():
>>     this.cache.init(pminX, pminY, pmaxX, pmaxY);
>>
>> In PiscesCache:
>>
>>     void init(int minx, int miny, int maxx, int maxy) {
>>         ...
>>         // LBO: why add +1 ??
>>         bboxX1 = maxx /* + 1 */;
>>         bboxY1 = maxy /* + 1 */;
>
> I think the values passed in are "the smallest and largest values we encountered" and so adding one computes the first coordinate that is "not inside the sample region" as per the half-open philosophy.
>
> I tend to use and encourage "min/max" naming for closed-interval values and xy0/xy1 or xy1/xy2 for half-open-interval values.
>
>> => PiscesCache previously added +1 to the maximum (upper loop condition y < y1), but pmaxX already uses ceil (+1) and (spmaxY + SUBPIXEL_MASK_Y) ensures the last pixel is reached.
>
> spmaxY + SUBPIXEL_MASK_Y computes the last sub-pixel piece of the last pixel. It is essentially equivalent to "lastrow.9999999". So, the first coordinate not included would be "lastrow+1", which is simply adding +1 to the sub-pixel "max" value that was given.
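If I translate the "lastrow.9999999" explanation into code (with assumed values SUBPIXEL_LG_POSITIONS_Y = 3 and SUBPIXEL_MASK_Y = 7, i.e. 8 sub-pixel rows per pixel row; these are illustrative, not the actual Pisces constants):

```java
public class SubPixelToPixel {
    // assumed values for illustration: 8 sub-pixel rows per pixel row
    static final int SUBPIXEL_LG_POSITIONS_Y = 3;
    static final int SUBPIXEL_MASK_Y = (1 << SUBPIXEL_LG_POSITIONS_Y) - 1; // 7

    /** Pixel row containing a sub-pixel row (round down). */
    static int pixelFloor(int spy) {
        return spy >> SUBPIXEL_LG_POSITIONS_Y;
    }

    /** Rounds a sub-pixel max up to a pixel boundary ("lastrow.9999999" + 1 sub-pixel). */
    static int pixelCeil(int spy) {
        return (spy + SUBPIXEL_MASK_Y) >> SUBPIXEL_LG_POSITIONS_Y;
    }

    public static void main(String[] args) {
        int spmaxY = 49;               // sub-pixel max NOT on a pixel boundary
        assert pixelFloor(spmaxY) == 6; // the pixel row containing it
        assert pixelCeil(spmaxY) == 7;  // already the rounded-up pixel row
    }
}
```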
I understand, but I figured out that pmaxX or pmaxY (pixel coordinates) may already be rounded up to the upper integer:

    spmaxY >> SUBPIXEL_LG_POSITIONS_Y                     => 6
    (spmaxY + SUBPIXEL_MASK_Y) >> SUBPIXEL_LG_POSITIONS_Y => 7

So adding +1 in PiscesCache will give maxY = 8!

>> I fixed these limits to have correct rendering due to the dirty rowAARLE reuse.
>
> If there was a bug before I'd prefer to have it fixed within the philosophy of half-open intervals rather than trying to translate into closed intervals - it keeps all of the code on the same page.

I think it was not a real bug, as the new rowAARLE[][] was big enough and zero-filled, so it only led to larger boundaries and tiles.

> 2013/4/24 Jim Graham <james.gra...@oracle.com <mailto:james.gra...@oracle.com>>
>
>> Do you have any faster implementation for Math.ceil/floor? I now use casting 1 + (int) / (int), but I know it is only correct for positive numbers (> 0).
>
> ceil(integer) == integer, but your technique would return "integer+1". I don't remember a decent technique for doing ceil with a cast.

I disagree:

- ceil(float) => the smallest (closest to negative infinity) floating-point value that is greater than or equal to the argument and is equal to a mathematical integer.
- floor(float) => the largest (closest to positive infinity) floating-point value that is less than or equal to the argument and is equal to a mathematical integer.

i.e.
    floor(x) <= x <= ceil(x)

Here are numbers comparing the Math.ceil / floor operations and my casting implementation (only valid for positive values):

    floats        = [-134758.4, -131.5, -17.2, -1.9, -0.9, -1.0E-8, -1.0E-23, -0.0, 0.0, 134758.4, 131.5, 17.2, 1.9, 0.9, 1.0E-8, 1.0E-23]
    mathCeil      = [-134758, -131, -17, -1, 0, 0, 0, 0, 0, 134759, 132, 18, 2, 1, 1, 1]
    fastMathCeil  = [-134757, -130, -16, 0, 1, 1, 1, 1, 1, 134759, 132, 18, 2, 1, 1, 1]
    mathFloor     = [-134759, -132, -18, -2, -1, -1, -1, 0, 0, 134758, 131, 17, 1, 0, 0, 0]
    fastMathFloor = [-134758, -131, -17, -1, 0, 0, 0, 0, 0, 134758, 131, 17, 1, 0, 0, 0]

Micro benchmark results:

    # FastMathCeil: run duration: 10 000 ms
    4 threads, Tavg = 44,95 ns/op (σ = 0,39 ns/op), Total ops = 890974137
    [44,54 (224799696), 44,76 (223695800), 45,59 (219626233), 44,93 (222852408)]

    # MathCeil: run duration: 10 000 ms
    4 threads, Tavg = 125,59 ns/op (σ = 1,71 ns/op), Total ops = 318415673
    [125,81 (79406107), 128,11 (78061562), 125,22 (79862351), 123,33 (81085653)]

>> - clipping issues: for a very large dashed rectangle, millions of segments are emitted from the dasher (x1) to the stroker (x4 segments) to the renderer (x4 segments).
>>
>> However, I would like to add a minimal clipping algorithm, but the rendering pipeline seems to be handling affine transforms between stages:
>>
>>     /*
>>      * Pipeline seems to be:
>>      * shape.getPathIterator
>>      *  -> inverseDeltaTransformConsumer
>>      *  -> Dasher
>>      *  -> Stroker
>>      *  -> deltaTransformConsumer OR transformConsumer
>>      *  -> pc2d = Renderer (bounding box)
>>      */
>
> If we could teach the stroker to be able to apply an elliptical pen then we could do the transforms once up front and simplify that quite a lot. That optimization remains TBD. We currently at least detect uniform scales and those can be handled with a single transform up front and multiplying all dash segment values and the line width by the uniform scale factor.
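Coming back to the ceil/floor casting question above: the cast trick can be made correct for both signs with a single extra comparison (my sketch, not the Pisces code; only valid while x fits in an int):

```java
public class FastMath {

    /** Math.floor for the int range, correct for both signs. */
    static int fastFloor(float x) {
        int i = (int) x;            // truncates toward zero
        return (x < i) ? i - 1 : i; // correct the truncation for negative x
    }

    /** Math.ceil for the int range, correct for both signs. */
    static int fastCeil(float x) {
        int i = (int) x;            // truncates toward zero
        return (x > i) ? i + 1 : i; // correct the truncation for positive x
    }

    public static void main(String[] args) {
        float[] floats = { -131.5f, -17.2f, -1.9f, -0.9f, -0.0f, 0.0f, 0.9f, 1.9f, 17.2f, 131.5f };
        for (float f : floats) {
            assert fastFloor(f) == (int) Math.floor(f);
            assert fastCeil(f)  == (int) Math.ceil(f);
        }
        System.out.println("fastFloor / fastCeil match Math.floor / Math.ceil");
    }
}
```

Note that exact integers (e.g. 2.0f) take the "no correction" branch, which fixes the "ceil(integer) == integer+1" flaw of the unconditional 1 + (int) cast.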
> But, anything that transforms a circle into an ellipse requires us to use the multi-stage silliness above.

Could you give me more details about the "elliptical pen", i.e. shearing shapes?

Another idea to be discussed: round caps are quite expensive (each curve is broken into lines), and an optimization could determine whether the cap is invisible (out of the clip bounds) or visually useless, i.e. too small - the half circle spans less than 1 pixel (true pixel or AA sub-pixel). Any idea? Maybe I should only check whether the line width is too small to render caps in such situations...

>> It is complicated to perform clipping in the Renderer, and maybe too late, as a complete stroker segment must be considered as a whole (4 segments, without caps...).
>>
>> Could you give me advice on how to add a clipping algorithm to the rendering pipeline (dasher / stroker / renderer)?
>>
>> Should I try to derive a bounding box for the dasher / stroker depending on the affine transforms?
>
> Yes, that sounds right so far, but it's pretty high-level and I'm not sure where your level of understanding here falls short...? I guess what I would do is:
>
> - compute an expanded "clip box" for the stroker based on the original clip, the transform, the line width, and the miter limit. Note that the transform applied to the stroke coordinates varies depending on whether we use the "uniform scaling" optimization or not
>
> - test coordinates for being in the expanded clip box in the PathConsumer2D methods of Stroker. If the segment is rejected, be sure to update all of the internal computations (normal values, last moveto coords, etc.) that might be needed for stroking the following segment

That's the plan, but I would like to do it twice: in the dasher and in the stroker... maybe it should be coded once in a new PiscesClipUtils class or in Helpers...
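For the first bullet, the expanded clip box I have in mind would look something like this (hypothetical helper, not an existing Pisces API; the margin uses the usual worst-case stroke growth of lineWidth/2 scaled by the miter limit, and ignores the transform and cap details you mention):

```java
import java.awt.geom.Rectangle2D;

public class StrokeClipBox {

    /**
     * Grows the device clip by the worst-case distance a stroked outline can
     * extend beyond its path: half the line width, scaled by the miter limit
     * to cover sharp miter joins (hypothetical helper, not Pisces API).
     */
    static Rectangle2D expandClip(Rectangle2D clip, float lineWidth, float miterLimit) {
        double margin = 0.5 * lineWidth * Math.max(miterLimit, 1.0f);
        return new Rectangle2D.Double(
                clip.getX() - margin, clip.getY() - margin,
                clip.getWidth() + 2 * margin, clip.getHeight() + 2 * margin);
    }

    public static void main(String[] args) {
        // a 100x100 clip, 4px wide stroke, miter limit 10 => 20px margin per side
        Rectangle2D c = expandClip(new Rectangle2D.Double(0, 0, 100, 100), 4f, 10f);
        System.out.println(c);
    }
}
```

Segments entirely outside this box could then be rejected in the Stroker's PathConsumer2D methods, as long as the internal state (normals, last moveto, etc.) is still updated.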
> - unfortunately this means that the dasher doesn't get the advantages. You could add similar code to the dasher, but then you have to be sure an entire dash is omitted before you can omit the data going on, otherwise the stroker will not get proper incoming coordinates from "the previous segment that was out of bounds". That could be managed with proper communication between the dasher and stroker, but then it gets more complicated.

Agreed, but maybe it should be analyzed more deeply to evaluate whether it is too hard...

> Those are the caveats that I would keep in mind while I was designing a solution - hope it helps some...
>
> ...jim

PS: I have more free time until Sunday (the children are away), so I would like to talk to you on Skype if possible. If yes, could you give me your Skype account and working hours (GMT time please)? I am living in Grenoble, France (GMT+1).

Thanks a lot,
Laurent