Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-11 Thread Jim Graham

Hi Denis,

I think the key point is when you say "it's better to have too many 
roots than too few".


Also, I'm not mathematically familiar with what the case of "D close to 
zero" means.  Generally, in floating point math, when you do a test and 
change the approach based on the results of that test I would hope that 
the answers returned by both approaches would be similar for cases near 
the decision point of the test.  So, for example, if a 3 root case of 
5,10,15 yielded a D that was "strongly negative" and a 3 root case of 
5,9.9,10.1 yielded a D that was "just barely negative" and the 2 
root case of "5,10" yielded a D that was "just barely positive" then I 
would have no problem with the fact that we classify the 2 vs. 3 root 
case on the sign of D.  The thing I haven't figured out, though, is 
whether that is indicative of the case where the 2 vs. 3 root switchover 
happens.


So, I guess I'm OK with minimal refinement if its sole purpose is to 
eliminate "false duplicates" in the roots.


I think it is more important if it is making sure a root is on the right 
side of "0 and 1" when it is being used to solve cubic curve 
intersection problems.


And I think there is an argument to be made for refining if it is being 
used to make the "answers that came from transcendental trig functions" 
more accurate, since those functions aren't necessarily as accurate as 
the math used elsewhere in the function.


...jim

On 1/10/2011 1:13 PM, Denis Lila wrote:

Hi Jim.

I've attached another modification of your CubicSolver.java.
I tried many different things, but I think what's in the
attachment is the only satisfactory implementation. The logic
is somewhat similar to what you suggested, in that it computes
3 roots for D < 0. However, once roots are computed, it doesn't
try very hard at all to eliminate them in case too many were
computed. I tried doing this, but it was very problematic, because
the only reliable way to count roots is to split the domain into
intervals where the polynomial is strictly increasing or decreasing
by finding the roots of its derivative, evaluating the poly at the
interval end points and then counting sign changes. This works for
well behaved polynomials, but not for edge cases where D is very
small (which is the only situation where we actually need it to work)
because in cases where 3 roots are computed but only 2 exist one of
the critical points will also be a root, so the function will be
locally flat at one of its roots which will make solveEqn(eqn, 3, x)
fluctuate a lot near the root and the assumption that the function
is monotonic in each interval will not hold. Also, it's better to
have too many roots than too few.
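For reference, the interval/sign-change root counting Denis describes could be sketched like this in Java. All names, the bracketing interval, and the helper layout are illustrative assumptions, not code from the actual patch:

```java
// Sketch: split [lo, hi] at the roots of the cubic's derivative and count
// sign changes across each (monotonic) interval. Illustrative only.
public class RootCounter {
    // eqn = {c, b, a, d}: eqn[0] + eqn[1]*x + eqn[2]*x^2 + eqn[3]*x^3
    static double eval(double[] eqn, double x) {
        return eqn[0] + x * (eqn[1] + x * (eqn[2] + x * eqn[3]));
    }

    // Number of sign changes of the cubic over [lo, hi]; equals the number
    // of real roots there when the cubic is monotonic between break points.
    static int countRoots(double[] eqn, double lo, double hi) {
        double d = eqn[3], a = eqn[2], b = eqn[1];
        // critical points: roots of the derivative 3d*x^2 + 2a*x + b
        double disc = 4 * a * a - 12 * d * b;
        java.util.List<Double> pts = new java.util.ArrayList<>();
        pts.add(lo);
        if (disc > 0 && d != 0) {
            double s = Math.sqrt(disc);
            double r1 = (-2 * a - s) / (6 * d);
            double r2 = (-2 * a + s) / (6 * d);
            double c1 = Math.min(r1, r2), c2 = Math.max(r1, r2);
            if (c1 > lo && c1 < hi) pts.add(c1);
            if (c2 > lo && c2 < hi) pts.add(c2);
        }
        pts.add(hi);
        int changes = 0;
        for (int i = 0; i + 1 < pts.size(); i++) {
            double f0 = eval(eqn, pts.get(i));
            double f1 = eval(eqn, pts.get(i + 1));
            if (f0 == 0 || f0 * f1 < 0) changes++;
        }
        return changes;
    }

    public static void main(String[] args) {
        // (x-1)(x-2)(x-3) = -6 + 11x - 6x^2 + x^3
        double[] eqn = {-6, 11, -6, 1};
        System.out.println(countRoots(eqn, 0, 10));
    }
}
```

This also illustrates the failure mode Denis points out: at a double root the critical point *is* the root, so f0 hovers around 0 and the sign-change count becomes unreliable exactly when it is needed.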

I modified trySolve3 to count calls of solveCubicNew that find too
many or too few roots. When I run trySolve 1000 times it never finds
fewer roots than there actually are (or maybe it does but in extremely
rare cases, but I don't remember seeing any instances of this). It finds
too many roots in ~3000 cases compared to ~2500 of the version that
doesn't call fixRoots() (note that this isn't the same fixRoots that
is used by the old function). I think this is very good.

As for performance, it's not as good as the version that doesn't
call fixRoots, but accuracy has improved a lot. I tried to calibrate
Newton's method in the root refining function to get good accuracy
with as few iterations as possible. 3 iterations is a very good
compromise (although we might be able to get away with 2).
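A minimal sketch of what a fixed-iteration Newton refiner like the one described above might look like. The eval helpers and the 3-iteration cap are illustrative assumptions, not the actual CubicSolver.java code:

```java
// Sketch: refine an approximate cubic root with a fixed number of
// Newton iterations (3, per the compromise discussed above).
public class RootRefiner {
    // eqn[0] + eqn[1]*x + eqn[2]*x^2 + eqn[3]*x^3
    static double eval(double[] eqn, double x) {
        return eqn[0] + x * (eqn[1] + x * (eqn[2] + x * eqn[3]));
    }
    static double evalDeriv(double[] eqn, double x) {
        return eqn[1] + x * (2 * eqn[2] + x * 3 * eqn[3]);
    }

    static double refine(double[] eqn, double x) {
        for (int i = 0; i < 3; i++) {
            double fp = evalDeriv(eqn, x);
            if (fp == 0) break;   // locally flat: Newton step undefined
            x = x - eval(eqn, x) / fp;
        }
        return x;
    }

    public static void main(String[] args) {
        // (x-1)(x-2)(x-3); refine a rough guess near the root at 2
        double[] eqn = {-6, 11, -6, 1};
        System.out.println(refine(eqn, 2.1));
    }
}
```

Note the derivative check: near a double root the derivative goes to zero and an unguarded Newton step can shoot far away, which is the same flatness problem Denis describes.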

Regards,
Denis.

- Original Message -

Hi Denis,

What about logic like this:

boolean checkRoots = false;
if (D < 0) {
    // 3 solution form is possible, so use it
    checkRoots = (D > -TINY); // Check them if we were borderline
    // compute 3 roots as before
} else {
    double u = ...;
    double v = ...;
    res[0] = u+v; // should be 2*u if D is near zero
    if (u close to v) { // Will be true for D near zero
        res[1] = -res[0]/2; // should be -u if D is near zero
        checkRoots = true; // Check them if we were borderline
        // Note that q=0 case ends up here as well...
    }
}
if (checkRoots) {
    if (num > 2 && (res[2] == res[1] || res[2] == res[0])) {
        num--;
    }
    if (num > 1 && res[1] == res[0]) {
        res[1] = res[--num]; // Copies res[2] to res[1] if needed
    }
    for (int i = num-1; i >= 0; i--) {
        res[i] = refine(res[i]);
        for (int j = i+1; j < num; j++) {
            if (res[i] == res[j]) {
                res[i] = res[--num];
                break;
            }
        }
    }
}

Note that we lose the optimization of calculating just 2*u and -u for
the 2 root case, but that only happened in rare circumstances. Also, if
D is near zero and negative, then we generate 3 roots using
transcendentals and potentially refine one away, but that should also be
an uncommon situation and "there but for the grace of being a tiny
negative number would we have gone anyway" so I think it is OK to take
the long way to the answer.

Also, one could argue that if we used the transcendentals to calculate
the 3 roots, it couldn't hurt to refine the answers anyway. The other
solutions should have higher precision, but the transcendental results
will be much less accurate.

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-10 Thread Denis Lila
Hi Jim.

I've attached another modification of your CubicSolver.java.
I tried many different things, but I think what's in the
attachment is the only satisfactory implementation. The logic
is somewhat similar to what you suggested, in that it computes
3 roots for D < 0. However, once roots are computed, it doesn't
try very hard at all to eliminate them in case too many were
computed. I tried doing this, but it was very problematic, because
the only reliable way to count roots is to split the domain into
intervals where the polynomial is strictly increasing or decreasing
by finding the roots of its derivative, evaluating the poly at the
interval end points and then counting sign changes. This works for
well behaved polynomials, but not for edge cases where D is very
small (which is the only situation where we actually need it to work)
because in cases where 3 roots are computed but only 2 exist one of
the critical points will also be a root, so the function will be
locally flat at one of its roots which will make solveEqn(eqn, 3, x)
fluctuate a lot near the root and the assumption that the function
is monotonic in each interval will not hold. Also, it's better to
have too many roots than too few.

I modified trySolve3 to count calls of solveCubicNew that find too
many or too few roots. When I run trySolve 1000 times it never finds
fewer roots than there actually are (or maybe it does but in extremely
rare cases, but I don't remember seeing any instances of this). It finds
too many roots in ~3000 cases compared to ~2500 of the version that
doesn't call fixRoots() (note that this isn't the same fixRoots that
is used by the old function). I think this is very good.

As for performance, it's not as good as the version that doesn't
call fixRoots, but accuracy has improved a lot. I tried to calibrate
Newton's method in the root refining function to get good accuracy
with as few iterations as possible. 3 iterations is a very good
compromise (although we might be able to get away with 2).

Regards,
Denis.

- Original Message -
> Hi Denis,
> 
> What about logic like this:
> 
> boolean checkRoots = false;
> if (D < 0) {
> // 3 solution form is possible, so use it
> checkRoots = (D > -TINY); // Check them if we were borderline
> // compute 3 roots as before
> } else {
> double u = ...;
> double v = ...;
> res[0] = u+v; // should be 2*u if D is near zero
> if (u close to v) { // Will be true for D near zero
> res[1] = -res[0]/2; // should be -u if D is near zero
> checkRoots = true; // Check them if we were borderline
> // Note that q=0 case ends up here as well...
> }
> }
> if (checkRoots) {
> if (num > 2 && (res[2] == res[1] || res[2] == res[0])) {
> num--;
> }
> if (num > 1 && res[1] == res[0]) {
> res[1] = res[--num]; // Copies res[2] to res[1] if needed
> }
> for (int i = num-1; i >= 0; i--) {
> res[i] = refine(res[i]);
> for (int j = i+1; j < num; j++) {
> if (res[i] == res[j]) {
> res[i] = res[--num];
> break;
> }
> }
> }
> }
> 
> Note that we lose the optimization of calculating just 2*u and -u for
> the 2 root case, but that only happened in rare circumstances. Also,
> if
> D is near zero and negative, then we generate 3 roots using
> transcendentals and potentially refine one away, but that should also
> be
> an uncommon situation and "there but for the grace of being a tiny
> negative number would we have gone anyway" so I think it is OK to take
> the long way to the answer.
> 
> Also, one could argue that if we used the transcendentals to calculate
> the 3 roots, it couldn't hurt to refine the answers anyway. The other
> solutions should have higher precision, but the transcendental results
> will be much less accurate.
> 
> Finally, this lacks the "refine them anyway if any of them are near 0
> or
> 1" rule - the original only did that if the transcendentals were used,
> but it would be nice to do that for any of the cases. It might make
> sense to have a variant that takes a boolean indicating whether to
> ensure higher accuracy around 0 and 1, but that would require an API
> change request...
> 
> ...jim
> 
> On 1/4/11 2:02 PM, Denis Lila wrote:
> > Hi Jim.
> >
> >> The test as it is has a test case (I just chose random numbers to
> >> check
> >> and got lucky - d'oh!) that generates 1 solution from the new code
> >> even
> >> though the equation had 2 distinct solutions that weren't even near
> >> each
> >> other...
> >
> > I figured out why this happens. It's because of cancellation in the
> > computation of D (two large numbers are subtracted and the result is
> > supposed to be 0 or close to 0, but it's about 1e-7, which wasn't
> > enough to pass the iszero test). I've been working on this and I
> > came up with a couple of different ways. They are in the attached
> > file (it's a modified version of the file your CubicSolve.java).
> >
> > The first thing I did was to modify solveCubicOld. I tried to get
> > a bit fancy and although I think I fixed the problems it had, the
> > end result is ugly, complicated and it has small problems, like
> > returning 3 very close roots when there should only be one.

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-05 Thread Denis Lila
Hi Jim.

> If you want to include these rendering tests as extra verification
> along with your other changes, then that is fine.

Ok, thanks. So, I'll update the performance webrev to include them.

> Also, I think we might have a script that forcibly checks the value
> of the @bug tag and ensures that it is a legal bug database number, so
> using a RedHat bug number won't work very well. Is there an existing
> bug that this could be tagged with?

Yeah, this is a problem. I tried finding an existing sun bug for this
issue, but I couldn't, and we can't file one because it's already
fixed. It's just one test though, so we could simply leave it out.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-05 Thread Denis Lila
Hi Jim.

> What about logic like this:
> 
> boolean checkRoots = false;
> if (D < 0) {
> // 3 solution form is possible, so use it
> checkRoots = (D > -TINY); // Check them if we were borderline
> // compute 3 roots as before
> } else {
> double u = ...;
> double v = ...;
> res[0] = u+v; // should be 2*u if D is near zero
> if (u close to v) { // Will be true for D near zero
> res[1] = -res[0]/2; // should be -u if D is near zero
> checkRoots = true; // Check them if we were borderline
> // Note that q=0 case ends up here as well...
> }
> }
> if (checkRoots) {
> if (num > 2 && (res[2] == res[1] || res[2] == res[0])) {
> num--;
> }
> if (num > 1 && res[1] == res[0]) {
> res[1] = res[--num]; // Copies res[2] to res[1] if needed
> }
> for (int i = num-1; i >= 0; i--) {
> res[i] = refine(res[i]);
> for (int j = i+1; j < num; j++) {
> if (res[i] == res[j]) {
> res[i] = res[--num];
> break;
> }
> }
> }
> }

I have two concerns about this.

1. How would refine() be implemented? Would we do something like in
findZero?
2. While I have no problem with your suggestion, it won't help us in
cases where there's a root with a multiplicity of 2 which is not found.
This was a large part of the motivation for this work. What should we
do in this case? The only solution I can see is to find the roots of
the derivative and evaluate the cubic at them, because when a root
has multiplicity > 1, the polynomial's derivative also has a root
there. This is what I'm currently doing.
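A rough sketch of the double-root check Denis describes: a root of multiplicity 2 is also a root of the derivative, so evaluate the cubic at the derivative's roots and accept critical points where the cubic is (nearly) zero. The names and the zero tolerance are illustrative guesses, not the actual patch:

```java
// Sketch: recover a missed multiplicity-2 root from the derivative's roots.
public class DoubleRootCheck {
    static double eval(double[] eqn, double x) {
        return eqn[0] + x * (eqn[1] + x * (eqn[2] + x * eqn[3]));
    }

    // Appends critical points where the cubic is ~0 to res; returns new count.
    static int addCriticalRoots(double[] eqn, double[] res, int num) {
        double d = eqn[3], a = eqn[2], b = eqn[1];
        double disc = 4 * a * a - 12 * d * b; // discriminant of 3d*x^2 + 2a*x + b
        if (d == 0 || disc < 0) {
            return num;                       // no real critical points
        }
        double s = Math.sqrt(disc);
        double[] crit = { (-2 * a + s) / (6 * d), (-2 * a - s) / (6 * d) };
        for (double x : crit) {
            // tolerance is a guess; real code would scale it to the coefficients
            if (Math.abs(eval(eqn, x)) < 1e-10) {
                res[num++] = x;
            }
        }
        return num;
    }

    public static void main(String[] args) {
        // (x-1)^2 (x-4) = x^3 - 6x^2 + 9x - 4: double root at 1
        double[] eqn = {-4, 9, -6, 1};
        double[] res = new double[3];
        res[0] = 4;  // suppose the closed form only found x = 4
        int num = addCriticalRoots(eqn, res, 1);
        System.out.println(num + " roots; second = " + res[1]);
    }
}
```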

Should we go ahead and push the pisces changes? The cubic solver
in pisces is good enough for what we're using it for.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-04 Thread Jim Graham

Hi Denis,

What about logic like this:

boolean checkRoots = false;
if (D < 0) {
// 3 solution form is possible, so use it
checkRoots = (D > -TINY);  // Check them if we were borderline
// compute 3 roots as before
} else {
double u = ...;
double v = ...;
res[0] = u+v; // should be 2*u if D is near zero
if (u close to v) {// Will be true for D near zero
res[1] = -res[0]/2;  // should be -u if D is near zero
checkRoots = true;  // Check them if we were borderline
// Note that q=0 case ends up here as well...
}
}
if (checkRoots) {
if (num > 2 && (res[2] == res[1] || res[2] == res[0])) {
num--;
}
if (num > 1 && res[1] == res[0]) {
res[1] = res[--num];  // Copies res[2] to res[1] if needed
}
for (int i = num-1; i >= 0; i--) {
res[i] = refine(res[i]);
for (int j = i+1; j < num; j++) {
if (res[i] == res[j]) {
res[i] = res[--num];
break;
}
}
}
}

Note that we lose the optimization of calculating just 2*u and -u for 
the 2 root case, but that only happened in rare circumstances.  Also, if 
D is near zero and negative, then we generate 3 roots using 
transcendentals and potentially refine one away, but that should also be 
an uncommon situation and "there but for the grace of being a tiny 
negative number would we have gone anyway" so I think it is OK to take 
the long way to the answer.


Also, one could argue that if we used the transcendentals to calculate 
the 3 roots, it couldn't hurt to refine the answers anyway.  The other 
solutions should have higher precision, but the transcendental results 
will be much less accurate.


Finally, this lacks the "refine them anyway if any of them are near 0 or 
1" rule - the original only did that if the transcendentals were used, 
but it would be nice to do that for any of the cases.  It might make 
sense to have a variant that takes a boolean indicating whether to 
ensure higher accuracy around 0 and 1, but that would require an API 
change request...


...jim

On 1/4/11 2:02 PM, Denis Lila wrote:

Hi Jim.


The test as it is has a test case (I just chose random numbers to
check
and got lucky - d'oh!) that generates 1 solution from the new code
even
though the equation had 2 distinct solutions that weren't even near
each
other...


I figured out why this happens. It's because of cancellation in the
computation of D (two large numbers are subtracted and the result is
supposed to be 0 or close to 0, but it's about 1e-7, which wasn't
enough to pass the iszero test). I've been working on this and I
came up with a couple of different ways. They are in the attached
file (it's a modified version of the file your CubicSolve.java).

The first thing I did was to modify solveCubicOld. I tried to get
a bit fancy and although I think I fixed the problems it had, the
end result is ugly, complicated and it has small problems, like
returning 3 very close roots when there should only be one.

The other solution is to just check if the roots of the derivative
are also roots of the cubic polynomial if only 1 root was computed
by the closed form algorithm. This doesn't have the numerical
accuracy of the first way (which used bisectRoots when things went wrong)
but it's much faster, doesn't have the multiple roots problem, and it's
much simpler. I called your trySolve function on a few hundred
polynomials with random roots in [-10, 10] and it never finds fewer
roots than there actually are. Sometimes it finds 3 roots when there are
only 2, but I don't think this is a huge problem.

I've attached what I have so far.

Regards,
Denis.

- Original Message -

Hi Denis,

I'm attaching a test program I wrote that compares the old and new
algorithms.

Obviously the old one missed a bunch of solutions because it
classified
all solutions as 1 or 3, but the new one also sometimes misses a
solution. You might want to turn this into an automated test for the
bug (and maybe use it as a stress test with a random number
generator).

I think one problem might be that you use "is close to zero" to check
if
you should use special processing. I think any tests which say "do it
this way and get fewer roots" should be conservative and if we are on
the borderline and we can do the code that generates more solutions
then
we should generate more and then maybe refine the roots and eliminate
duplicates. That way we can be (more) sure not to leave any roots
unsolved.





...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-04 Thread Denis Lila
Hi Jim.

> The test as it is has a test case (I just chose random numbers to
> check
> and got lucky - d'oh!) that generates 1 solution from the new code
> even
> though the equation had 2 distinct solutions that weren't even near
> each
> other...

I figured out why this happens. It's because of cancellation in the
computation of D (two large numbers are subtracted and the result is
supposed to be 0 or close to 0, but it's about 1e-7, which wasn't
enough to pass the iszero test). I've been working on this and I
came up with a couple of different ways. They are in the attached
file (it's a modified version of the file your CubicSolve.java).

The first thing I did was to modify solveCubicOld. I tried to get
a bit fancy and although I think I fixed the problems it had, the
end result is ugly, complicated and it has small problems, like
returning 3 very close roots when there should only be one.

The other solution is to just check if the roots of the derivative
are also roots of the cubic polynomial if only 1 root was computed
by the closed form algorithm. This doesn't have the numerical
accuracy of the first way (which used bisectRoots when things went wrong)
but it's much faster, doesn't have the multiple roots problem, and it's
much simpler. I called your trySolve function on a few hundred
polynomials with random roots in [-10, 10] and it never finds fewer
roots than there actually are. Sometimes it finds 3 roots when there are
only 2, but I don't think this is a huge problem.

I've attached what I have so far.

Regards,
Denis.

- Original Message -
> Hi Denis,
> 
> I'm attaching a test program I wrote that compares the old and new
> algorithms.
> 
> Obviously the old one missed a bunch of solutions because it
> classified
> all solutions as 1 or 3, but the new one also sometimes misses a
> solution. You might want to turn this into an automated test for the
> bug (and maybe use it as a stress test with a random number
> generator).
> 
> I think one problem might be that you use "is close to zero" to check
> if
> you should use special processing. I think any tests which say "do it
> this way and get fewer roots" should be conservative and if we are on
> the borderline and we can do the code that generates more solutions
> then
> we should generate more and then maybe refine the roots and eliminate
> duplicates. That way we can be (more) sure not to leave any roots
> unsolved.
> 

> 
> ...jim
import java.awt.geom.QuadCurve2D;
import java.util.Arrays;
import java.util.Random;

import static java.lang.Math.abs;
import static java.lang.Math.max;
import static java.lang.Math.ulp;

public class CubicSolver {
public static int solveCubicOld(double eqn[], double res[]) {
if (res == eqn) {
// Copy the eqn so that we don't clobber it with the
// roots.
eqn = new double[4];
System.arraycopy(res, 0, eqn, 0, 4);
}

// From Numerical Recipes, 5.6, Quadratic and Cubic Equations
double d = eqn[3];
if (d == 0.0) {
// The cubic has degenerated to quadratic (or line or ...).
return QuadCurve2D.solveQuadratic(eqn, res);
}
double a = eqn[2] / d;
double b = eqn[1] / d;
double c = eqn[0] / d;
int roots = 0;
double Q = (a * a - 3.0 * b) / 9.0;
double R = (2.0 * a * a * a - 9.0 * a * b + 27.0 * c) / 54.0;
double R2 = R * R;
double Q3 = Q * Q * Q;
a = a / 3.0;
if (R2 < Q3) {
double theta = Math.acos(R / Math.sqrt(Q3));
Q = -2.0 * Math.sqrt(Q);

res[roots++] = Q * Math.cos(theta / 3.0) - a;
res[roots++] = Q * Math.cos((theta + Math.PI * 2.0)/ 3.0) - a;
res[roots++] = Q * Math.cos((theta - Math.PI * 2.0)/ 3.0) - a;
} else {
boolean neg = (R < 0.0);
double S = Math.sqrt(R2 - Q3);
if (neg) {
R = -R;
}
double A = Math.pow(R + S, 1.0 / 3.0);
if (!neg) {
A = -A;
}
double B = (A == 0.0) ? 0.0 : (Q / A);
res[roots++] = (A + B) - a;
}

// we need it to have length 4. We will put the roots of the derivative
// in deriv[1] and deriv[2]
final double[] deriv = {eqn[1], 2*eqn[2], 3*eqn[3], 0};
int critCount = QuadCurve2D.solveQuadratic(deriv, deriv);
Arrays.sort(deriv, 0, critCount);
Arrays.sort(res, 0, roots);
// Even if there are fewer than 2 roots, this won't cause problems.
deriv[2] = deriv[1];
deriv[1] = deriv[0];
// The roots of any polynomial must lie in [-M, M] where M = 1 + (max{i=0,n-1}abs(ai))/abs(an)
// http://en.wikipedia.org/wiki/Sturm%27s_theorem#Applications
// Wikipedia says this result is due to Cauchy. There's no proof in the link,
// but I proved it myself (it's a bit long to include here).
doubl

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-28 Thread Jim Graham

Correction...

On 12/28/2010 3:00 PM, Jim Graham wrote:

math. (Though it begs the question - is "-q/sqrt(-p^3)" more accurate
than "-q/(p*sqrt(-p))"? If p is < 1 then the cube is an even smaller
number, does that matter?)


Make that "-q/(-p*sqrt(-p))", or "q/(p*sqrt(-p))"...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-28 Thread Jim Graham

Hi Denis,

I'm attaching a test program I wrote that compares the old and new 
algorithms.


Obviously the old one missed a bunch of solutions because it classified 
all solutions as 1 or 3, but the new one also sometimes misses a 
solution.  You might want to turn this into an automated test for the 
bug (and maybe use it as a stress test with a random number generator).


I think one problem might be that you use "is close to zero" to check if 
you should use special processing.  I think any tests which say "do it 
this way and get fewer roots" should be conservative and if we are on 
the borderline and we can do the code that generates more solutions then 
we should generate more and then maybe refine the roots and eliminate 
duplicates.  That way we can be (more) sure not to leave any roots unsolved.
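Jim's "generate more, then refine and eliminate duplicates" advice could look something like this as runnable code. This is an illustrative standalone version of the dedup loop from his pseudocode elsewhere in the thread; refine() is stubbed out as the identity here, where the real one would be a Newton step:

```java
// Sketch: refine each root, then drop roots that collapse into duplicates.
public class DedupRoots {
    static double refine(double x) { return x; } // placeholder refiner

    // Removes exact duplicates in res[0..num), returning the new count.
    static int dedup(double[] res, int num) {
        for (int i = num - 1; i >= 0; i--) {
            res[i] = refine(res[i]);
            for (int j = i + 1; j < num; j++) {
                if (res[i] == res[j]) {
                    res[i] = res[--num]; // overwrite the dup with the last root
                    break;
                }
            }
        }
        return num;
    }

    public static void main(String[] args) {
        double[] res = {5.0, 10.0, 10.0};
        System.out.println(dedup(res, 3)); // two distinct roots remain
    }
}
```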


The test as it is has a test case (I just chose random numbers to check 
and got lucky - d'oh!) that generates 1 solution from the new code even 
though the equation had 2 distinct solutions that weren't even near each 
other...


...jim



CubicSolver.java


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-28 Thread Jim Graham

Aha!

I finally figured out what I was missing.

On 12/24/2010 4:43 PM, Denis Lila wrote:

Line 1133 - I don't understand why that term has -q in it. The above
link and the original code both computed essentially the arccos of
this


Basically, the negative comes in when you push all of the p terms back 
inside the sqrt().


They had cos(3phi) = (3Q/2P)sqrt(-3/P)
   = (q/p)*sqrt(-1/p)
   = (-q/-p)*sqrt(-1/p)
   = -q*(-1/p)*sqrt(-1/p)
   = -q/(-p*sqrt(-p))
   = -q/sqrt(-p^3)
which is what you calculate, so we are not calculating the negative of 
the angle after all - we are calculating the same angle using different 
math.  (Though it begs the question - is "-q/sqrt(-p^3)" more accurate 
than "-q/(p*sqrt(-p))"?  If p is < 1 then the cube is an even smaller 
number, does that matter?)
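Spelled out, the chain of equalities Jim verifies is (using the code's variables p = P/3, q = Q/2, with p < 0 in the three-root case so every square root is real):

```latex
\begin{align*}
\frac{3Q}{2P}\sqrt{\frac{-3}{P}}
  &= \frac{3(2q)}{2(3p)}\sqrt{\frac{-3}{3p}}
   = \frac{q}{p}\sqrt{\frac{-1}{p}} \\
  &= \frac{-q}{-p}\cdot\frac{1}{\sqrt{-p}}
   = \frac{-q}{(-p)\sqrt{-p}}
   = \frac{-q}{\sqrt{(-p)^3}}
   = \frac{-q}{\sqrt{-p^3}}.
\end{align*}
```

The last step uses (-p)^3 = -p^3, which is why the two forms agree even though one pushes p inside the square root.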



Let X = (3q/2p)*sqrt(-3/p) where p and q are the ones from the wikipedia
article, not our code.
So, when we do ret[0] = t * cos(phi), that's:
   = t * cos(acos(-X))


actually, it really was t*cos(acos(X)), not -X.


   = t * cos(pi/3 - acos(X))
   = t * cos(acos(X) - pi/3)
   = t * cos(acos(X) - pi/3 - pi)


There was an error here in this step as well, this line should read:
 = t * cos(acos(X) - pi/3 - 2pi)
 = t * cos(acos(X) -7pi/3)
 = nothing because this isn't our math... ;-)


I unfortunately don't have access to the icedtea servers at this moment,
so I attached a patch. I hope that's ok.


Have you updated the webrev yet?

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-24 Thread Denis Lila
Hi Jim.

> Unfortunately, this means that the names here
> and the values assigned to them and the comment above them conflict.
> If the variables could be named "p/3" and "q/2" then all would be clear,
> but I don't know how to do that naming very easily. Perhaps the
> comment could be simply reworded:
> 
> // substitute 
> // x^3 + Px + Q = 0
> // Since we actually need P/3 and Q/2 for all of the
> // calculations that follow, we will calculate
> // p = P/3
> // q = Q/2
> // instead and use those values for simplicity of the code.

Good point. Done.

> Line 1105 - single quotes in comments freaks out my version of
> gnuemacs.
> I usually try to avoid them, except in pairs, but there isn't a better
> way to word this comment. :-(

We could always go with "the method of Cardano", although
that doesn't sound good at all. Or we could use a backtick instead of
the single quote - it looks similar enough.

> Lines 1157-1163 - the old code used to copy the eqn before it got
> clobbered with roots. Here it is too late. You probably need to move
> this code up near line 1135 before the 3 roots are stuffed into the
> res array. (Did you test the eqn==res case?)

You're right. I'm sorry about this.

> I noticed that the "Casus irreducibilis" case isn't in Cardano's
> method.
> He only finds roots for the 1 and 2 root case and punts for 3 roots.
> So, this is someone else's method. It would be nice to figure out who
> or what and list a reference, even though the Graphics Gems and the
> old
> code didn't. The closest reference I can find is unattributed on
> Wikipedia, but you could include it in a comment for reference:
> 
> http://en.wikipedia.org/wiki/Cubic_function#Trigonometric_.28and_hyperbolic.29_method

Done.

> Line 1133 - I don't understand why that term has -q in it. The above
> link and the original code both computed essentially the arccos of
> this
> formula without the negation of q. ??? Since acos(-v) == pi - acos(v)
> this would seem to negate the result and bias it by pi/3. Negating it
> won't affect the eventual cosine, but the bias by pi/3 will. Am I
> missing something?

I think you are. What's going on is that our code is computing
ret[0] = t2, ret[1] = t0, ret[2] = t1, where (t0, t1, t2 are the tk's
from the wikipedia link).

Let X = (3q/2p)*sqrt(-3/p) where p and q are the ones from the wikipedia
article, not our code.
So, when we do ret[0] = t * cos(phi), that's:
  = t * cos(acos(-X))
  = t * cos(pi/3 - acos(X))
  = t * cos(acos(X) - pi/3)
  = t * cos(acos(X) - pi/3 - pi)
  = t * cos(acos(X) - 2*2*pi/3)
  = t2

ret[0] and ret[1] are very similar to prove - you just add/subtract pi/3
from the argument to cos.

I unfortunately don't have access to the icedtea servers at this moment,
so I attached a patch. I hope that's ok.

Regards,
Denis.
--- dashing/2d/jdk/src/share/classes/java/awt/geom/CubicCurve2D.java	2010-12-24 10:01:45.378556843 -0500
+++ cc2d/2d/jdk/src/share/classes/java/awt/geom/CubicCurve2D.java	2010-12-24 11:01:42.205614715 -0500
@@ -1083,24 +1083,63 @@
  * @since 1.3
  */
 public static int solveCubic(double eqn[], double res[]) {
-// From Numerical Recipes, 5.6, Quadratic and Cubic Equations
-double d = eqn[3];
-if (d == 0.0) {
-// The cubic has degenerated to quadratic (or line or ...).
+// From Graphics Gems:
+// http://tog.acm.org/resources/GraphicsGems/gems/Roots3And4.c
+final double d = eqn[3];
+if (d == 0) {
 return QuadCurve2D.solveQuadratic(eqn, res);
 }
-double a = eqn[2] / d;
-double b = eqn[1] / d;
-double c = eqn[0] / d;
-int roots = 0;
-double Q = (a * a - 3.0 * b) / 9.0;
-double R = (2.0 * a * a * a - 9.0 * a * b + 27.0 * c) / 54.0;
-double R2 = R * R;
-double Q3 = Q * Q * Q;
-a = a / 3.0;
-if (R2 < Q3) {
-double theta = Math.acos(R / Math.sqrt(Q3));
-Q = -2.0 * Math.sqrt(Q);
+/* normal form: x^3 + Ax^2 + Bx + C = 0 */
+final double A = eqn[2] / d;
+final double B = eqn[1] / d;
+final double C = eqn[0] / d;
+
+
+//  substitute x = y - A/3 to eliminate quadratic term:
+// y^3 +Py + Q = 0
+//
+// Since we actually need P/3 and Q/2 for all of the
+// calculations that follow, we will calculate
+// p = P/3
+// q = Q/2
+// instead and use those values for simplicity of the code.
+
+double sq_A = A * A;
+double p = 1.0/3 * (-1.0/3 * sq_A + B);
+double q = 1.0/2 * (2.0/27 * A * sq_A - 1.0/3 * A * B + C);
+
+/* use Cardano's formula */
+
+double cb_p = p * p * p;
+double D = q * q + cb_p;
+
+int num;
+// XXX: we consider 0 to be anything within 1e-9 of 0.
+// Is this really right? Maybe 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-23 Thread Jim Graham
The regression tests for this bug do not call the method directly.  They 
may exercise the function indirectly in some pipelines, but not all 
pipelines will use this method (the current version of Pisces in OpenJDK 
doesn't even use it until you integrate your other changes as far as I 
know).


I'd write a regression test for this bug that is directly applicable to 
the method being tested (find what values are being handed to the method 
by these test cases and then just call Cubic.solveCubic directly with 
those values and figure out if the answers are reasonable).
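A sketch of the kind of direct regression test Jim suggests: call CubicCurve2D.solveCubic with known coefficients and check the returned roots. The coefficients here are illustrative, not the values from the actual failing rendering cases:

```java
// Sketch: a direct test of the public solver, independent of any pipeline.
import java.awt.geom.CubicCurve2D;
import java.util.Arrays;

public class SolveCubicTest {
    public static void main(String[] args) {
        // (x-1)(x-2)(x-3): three well-separated real roots expected
        double[] eqn = {-6.0, 11.0, -6.0, 1.0};
        double[] res = new double[4];
        int n = CubicCurve2D.solveCubic(eqn, res);
        if (n != 3) {
            throw new RuntimeException("expected 3 roots, got " + n);
        }
        Arrays.sort(res, 0, n);
        double[] expected = {1.0, 2.0, 3.0};
        for (int i = 0; i < n; i++) {
            if (Math.abs(res[i] - expected[i]) > 1e-6) {
                throw new RuntimeException("root " + i + " = " + res[i]);
            }
        }
        System.out.println("ok");
    }
}
```

A stress-test variant would pick random roots, multiply out the coefficients, and assert that no real root goes unreported, which is exactly the property the thread is trying to guarantee.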


If you want to include these rendering tests as extra verification along 
with your other changes, then that is fine.


Also, I think we might have a script that forcibly checks the value of 
the @bug tag and ensures that it is a legal bug database number, so 
using a RedHat bug number won't work very well.  Is there an existing 
bug that this could be tagged with?


...jim

On 12/20/2010 9:36 AM, Denis Lila wrote:

Hi Jim.


Lines 1094-1096, they could also be NaN if any of the numerators were
also zero and these tests might fail (but only for the case of all of
them being zero I guess, otherwise one of the other divisions would
result in infinity). Are accidental infinities (caused by overflow
rather than d==0.0) really a problem for the code to handle?


I'm not sure if they're a problem, but I thought that case should have
been handled just for robustness. However, I've changed the test
to d==0 because testing for infinities should be done, but not there.
For example, if the constant term was huge and d==0.5 we could get
an infinity but that shouldn't really be handled as a quadratic
polynomial. I will deal better with these cases in a future webrev.


I just noticed that the code you are replacing actually used to refine
the roots so maybe you should do some of this magic.


I missed that in the original code. I changed it now.


Also, in the webrev you'll find five regression tests that I would like
to push to openjdk7. They test for various problems the rendering engine
used to have. They're all pretty simple and I would appreciate it if you
could take a quick look at them. They're in the same webrev as cc2d because
it was more convenient for me, but obviously when/if they're pushed they
will be a separate changeset.

One more thing: there is a regression test for the rendering engine
called TestNPE that I think is problematic because it doesn't
necessarily test the rendering engine. It just draws an antialiased
line, which could be handled in any number of ways, so it's not very
robust. In fact, after we're done with the parallelogram pipelines,
the code that used to throw the NPE won't even execute, making this
test useless. We should either discard it or change it to use the
rendering engine explicitly, like my tests do. What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-23 Thread Jim Graham

Hi Denis,

Line 1099 - I decided to check out Cardano's method and noticed a 
discrepancy.  The comment here says we are calculating the p and q for 
this equation, but the values assigned to the p and q variables in lines 
1102,1103 happen to be p/3 and q/2.  That's fine because almost all of 
the values needed in the remaining logic in Cardano's method actually 
need those values instead of the original p and q so it makes sense to 
calculate them up front.  Unfortunately, this means that the names here 
and the values assigned to them and the comment above them conflict.  If 
the variables could be named "p/3" and "q/2" then all would be clear, 
but I don't know how to do that naming very easily.  Perhaps the comment 
could be simply reworded:


// substitute
//     x^3 + Px + Q = 0
// Since we actually need P/3 and Q/2 for all of the
// calculations that follow, we will calculate
//     p = P/3
//     q = Q/2
// instead and use those values for simplicity of the code.
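For reference, the substitution that produces those p and q values can be sketched as follows. This is an editor's illustrative sketch, not the webrev code; the class name and layout are hypothetical:

```java
// Sketch of the substitution x = t - a/3 that turns
//     x^3 + a*x^2 + b*x + c = 0
// into the depressed cubic t^3 + P*t + Q = 0, storing p = P/3 and
// q = Q/2 up front, as the comment above describes.
class DepressedCubic {
    final double p;      // P/3 = b/3 - a^2/9
    final double q;      // Q/2 = a^3/27 - a*b/6 + c/2
    final double shift;  // a/3; roots of the original cubic are x = t - shift

    DepressedCubic(double a, double b, double c) {
        shift = a / 3.0;
        p = b / 3.0 - shift * shift;
        q = shift * shift * shift - shift * (b / 2.0) + c / 2.0;
    }

    // With these halved/thirded conventions the discriminant is simply
    // q^2 + p^3: negative means three distinct real roots.
    double discriminant() {
        return q * q + p * p * p;
    }

    public static void main(String[] args) {
        // x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3) has three real roots,
        // so the discriminant must come out negative.
        DepressedCubic dc = new DepressedCubic(-6, 11, -6);
        if (!(dc.discriminant() < 0)) throw new AssertionError();
        System.out.println(dc.p + " " + dc.q + " " + dc.discriminant());
    }
}
```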

Line 1105 - single quotes in comments freak out my version of gnuemacs. 
 I usually try to avoid them, except in pairs, but there isn't a better 
way to word this comment.  :-(


Lines 1157-1163 - the old code used to copy the eqn before it got 
clobbered with roots.  Here it is too late.  You probably need to move 
this code up near line 1135 before the 3 roots are stuffed into the res 
array.  (Did you test the eqn==res case?)


I noticed that the "Casus irreducibilis" case isn't in Cardano's method. 
 He only finds roots for the 1 and 2 root case and punts for 3 roots. 
So, this is someone else's method.  It would be nice to figure out who 
or what and list a reference, even though the Graphics Gems and the old 
code didn't.  The closest reference I can find is unattributed on 
Wikipedia, but you could include it in a comment for reference:


http://en.wikipedia.org/wiki/Cubic_function#Trigonometric_.28and_hyperbolic.29_method

Line 1133 - I don't understand why that term has -q in it.  The above 
link and the original code both computed essentially the arccos of this 
formula without the negation of q.  ???  Since acos(-v) == pi - acos(v) 
this would seem to negate the result and bias it by pi/3.  Negating it 
won't affect the eventual cosine, but the bias by pi/3 will.  Am I 
missing something?
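One consistent reading (an editor's check, not a claim about the webrev's actual conventions): if the code's q really is Q/2 and p is P/3, then the textbook argument (3Q/(2P))*sqrt(-3/P) simplifies algebraically to -q/sqrt(-p^3), so the negated q would be exactly what the substitution produces rather than a bias by pi/3. The sketch below verifies that form numerically; the class and method names are hypothetical:

```java
// Three real roots of t^3 + P*t + Q = 0 via the trigonometric method,
// written in terms of p = P/3 and q = Q/2, where the arccos argument
// carries the "-q" discussed above. Assumes q*q + p*p*p < 0 (which
// forces p < 0 and three distinct real roots).
class TrigRoots {
    static double[] threeRoots(double p, double q) {
        double phi = Math.acos(-q / Math.sqrt(-p * p * p));
        double m = 2.0 * Math.sqrt(-p);
        return new double[] {
            m * Math.cos(phi / 3.0),
            m * Math.cos((phi - 2.0 * Math.PI) / 3.0),
            m * Math.cos((phi + 2.0 * Math.PI) / 3.0),
        };
    }

    public static void main(String[] args) {
        // t^3 - t = 0 has roots -1, 0, 1; here P = -1, Q = 0,
        // so p = -1/3 and q = 0. Every residual should be ~0.
        for (double t : threeRoots(-1.0 / 3.0, 0.0)) {
            double residual = t * t * t - t;
            if (Math.abs(residual) > 1e-12) throw new AssertionError();
            System.out.println(t);
        }
    }
}
```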


...jim

On 12/20/2010 9:36 AM, Denis Lila wrote:

Hi Jim.


Lines 1094-1096, they could also be NaN if any of the numerators were
also zero and these tests might fail (but only for the case of all of
them being zero I guess, otherwise one of the other divisions would
result in infinity). Are accidental infinities (caused by overflow
rather than d==0.0) really a problem for the code to handle?


I'm not sure if they're a problem, but I thought that case should have
been handled just for robustness. However, I've changed the test
to d==0 because testing for infinities should be done, but not there.
For example, if the constant term was huge and d==0.5 we could get
an infinity but that shouldn't really be handled as a quadratic
polynomial. I will deal better with these cases in a future webrev.


I just noticed that the code you are replacing actually used to refine
the roots so maybe you should do some of this magic.


I missed that in the original code. I changed it now.


Also, in the webrev you'll find five regression tests that I would like
to push to openjdk7. They test for various problems the rendering engine
used to have. They're all pretty simple and I would appreciate it if you
could take a quick look at them. They're in the same webrev as cc2d because
it was more convenient for me, but obviously when/if they're pushed they
will be a separate changeset.

One more thing: there is a regression test for the rendering engine
called TestNPE that I think is problematic because it doesn't
necessarily test the rendering engine. It just draws an antialiased
line, which could be handled in any number of ways, so it's not very
robust. In fact, after we're done with the parallelogram pipelines,
the code that used to throw the NPE won't even execute, making this
test useless. We should either discard it or change it to use the
rendering engine explicitly, like my tests do. What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-20 Thread Denis Lila
Hi Jim.

> Lines 1094-1096, they could also be NaN if any of the numerators were
> also zero and these tests might fail (but only for the case of all of
> them being zero I guess, otherwise one of the other divisions would
> result in infinity). Are accidental infinities (caused by overflow
> rather than d==0.0) really a problem for the code to handle?

I'm not sure if they're a problem, but I thought that case should have
been handled just for robustness. However, I've changed the test
to d==0 because testing for infinities should be done, but not there.
For example, if the constant term was huge and d==0.5 we could get
an infinity but that shouldn't really be handled as a quadratic
polynomial. I will deal better with these cases in a future webrev.

> I just noticed that the code you are replacing actually used to refine
> the roots so maybe you should do some of this magic.

I missed that in the original code. I changed it now.


Also, in the webrev you'll find five regression tests that I would like
to push to openjdk7. They test for various problems the rendering engine
used to have. They're all pretty simple and I would appreciate it if you
could take a quick look at them. They're in the same webrev as cc2d because
it was more convenient for me, but obviously when/if they're pushed they
will be a separate changeset.

One more thing: there is a regression test for the rendering engine
called TestNPE that I think is problematic because it doesn't
necessarily test the rendering engine. It just draws an antialiased
line, which could be handled in any number of ways, so it's not very
robust. In fact, after we're done with the parallelogram pipelines,
the code that used to throw the NPE won't even execute, making this
test useless. We should either discard it or change it to use the
rendering engine explicitly, like my tests do. What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-17 Thread Jim Graham

Hi Denis,

Lines 1094-1096, they could also be NaN if any of the numerators were 
also zero and these tests might fail (but only for the case of all of 
them being zero I guess, otherwise one of the other divisions would 
result in infinity).  Are accidental infinities (caused by overflow 
rather than d==0.0) really a problem for the code to handle?


I don't have any recommendations on your comment about the code that 
tests q for zero.  You could probably check if 2*u and -u were distinct 
as an alternate test, but that would cost you a cbrt() call.  In 
general, though, I'm guessing it's a rare case so saving the call to 
cbrt() is probably not important.  I will note, though, that if a number 
is very close to 0, but not 0, then its cube root will be a larger 
number than the original.
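That last point is easy to see numerically (editor's illustration):

```java
// Math.cbrt pulls values in (0, 1) toward 1: the cube root of a tiny
// positive number is many orders of magnitude larger than the number
// itself, which matters when comparing 2*u and -u for distinctness.
class CbrtDemo {
    public static void main(String[] args) {
        double tiny = 1e-30;
        double root = Math.cbrt(tiny);  // about 1e-10
        if (!(root > 1e6 * tiny)) throw new AssertionError();
        System.out.println(root);
    }
}
```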


I just noticed that the code you are replacing actually used to refine 
the roots so maybe you should do some of this magic.  However, it only 
bothered to refine the roots if there were 3 roots and they were near 0 
or 1 (because that might cause the caller to reject the root if it fell 
on the wrong side of 0 or 1).  Either way, look and see if fixRoots() 
and its friends are dead code now and/or if they should be replaced with 
your refinement techniques below.


I tried to review the code for correctness of formulae, but since I 
never really understood how the first version worked (I just copied it 
from Numerical Recipes), all I could do was to compare to the old 
version that it was similar to.  Frustratingly, the variable names 
conflicted and one of the values was calculated with reversed sign so it 
ended up being more frustrating than educational.  :-(


Since you apparently tested the new code extensively and got it from a 
source that had some amount of editorial review - I'll trust that 
process instead of my crossed eyes...


...jim

On 12/15/2010 7:13 AM, Denis Lila wrote:

Hi Jim.


Also, I wrote new hit testing code in jdk6 that used bezier recursion
to compute the values and it ran way faster than any root-finding methods
(mainly because all you have to worry about is subdividing enough that
the curve can be classified as above, below, or to the left or right
and you're done), so the containment methods could probably be fixed by
simply using the new code in sun.awt.geom.Curve and this method could
be updated with the new equations you found and left as an "approximate
method" like it always has been?


Well, I already have half the code I need to implement the ideas I
wrote, so... why not?

However, making solveCubic that good does not really seem to be a high
priority, so how about we quickly push just the new implementation I
found so we can fix most cases where an obvious root is missed, then
push this dashing/AA webrev (which will depend on the new solveCubic),
then I can fix the implementation of intersect using the recursive
subdivision you mentioned, and then I can take my time finishing
the implementation of the ideas from my last e-mail (these bugs/rfes
have waited 10+ years - they can wait 1-2 more months). Right now,
I would like to give priority to better pisces support of the new
parallelogram pipes and this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=662230

Here's the webrev for the new solveCubic implementation:
http://icedtea.classpath.org/~dlila/webrevs/cc2d/webrev/

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-15 Thread Denis Lila
Hi Jim.

> Also, I wrote new hit testing code in jdk6 that used bezier recursion
> to compute the values and it ran way faster than any root-finding methods
> (mainly because all you have to worry about is subdividing enough that
> the curve can be classified as above, below, or to the left or right
> and you're done), so the containment methods could probably be fixed by 
> simply using the new code in sun.awt.geom.Curve and this method could
> be updated with the new equations you found and left as an "approximate 
> method" like it always has been?

Well, I already have half the code I need to implement the ideas I
wrote, so... why not?

However, making solveCubic that good does not really seem to be a high
priority, so how about we quickly push just the new implementation I
found so we can fix most cases where an obvious root is missed, then
push this dashing/AA webrev (which will depend on the new solveCubic),
then I can fix the implementation of intersect using the recursive
subdivision you mentioned, and then I can take my time finishing
the implementation of the ideas from my last e-mail (these bugs/rfes
have waited 10+ years - they can wait 1-2 more months). Right now,
I would like to give priority to better pisces support of the new
parallelogram pipes and this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=662230

Here's the webrev for the new solveCubic implementation:
http://icedtea.classpath.org/~dlila/webrevs/cc2d/webrev/

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-14 Thread Jim Graham

Hi Denis,

That sounds like some very good ideas for making this method very accurate.

On the other hand, we're starting to get into the territory where an 
advanced math package should be catering to these requirements.  The 
solveCubic was an internal helper function for implementing the hit 
testing methods that I decided to make public back in 1.2 days.  There's 
a question on how much accuracy we should bother with.


Also, I wrote new hit testing code in jdk6 that used bezier recursion to 
compute the values and it ran way faster than any root-finding methods 
(mainly because all you have to worry about is subdividing enough that 
the curve can be classified as above, below, or to the left or right and 
you're done), so the containment methods could probably be fixed by 
simply using the new code in sun.awt.geom.Curve and this method could be 
updated with the new equations you found and left as an "approximate 
method" like it always has been?


...jim

On 12/14/2010 5:57 PM, Denis Lila wrote:

Hi Jim.


How big are these errors expressed as multiples of the ULP of the
coefficients?  Obviously 1e-17 is a lot smaller than 1e-4, but was 1e-17
representing "just a couple of bits of error" or was it still way off
with respect to the numbers being used? And were these fairly obscure
equations that were off?


The coefficients I used were eqn={-0.1, 0, 1, 1e-7} so compared to the ulps
of the coefficients, 1e-4 is pretty large.

I'm about to go now, but I would like to write this idea first:
it seems to me like the number of roots reported is much more
important than whether their accuracy is 1e-4 or 1e-17. So,
how about we solve for the roots of the derivative (which can be
done very quickly). Computing lim{x-->+/-inf}f(x) is very easy
(just a test on the most significant coefficient). We can then
evaluate the polynomial on the critical points and this would
allow us to very quickly compute the exact number of roots. If
the number computed using the closed form formula does not
match the real number, we do some refining work.

If we really wanted to optimize, we could also record how close
constants like D and q are to 0, and if they're within a certain
threshold, we could mark the roots they spawn as "suspicious",
and only do the test in the above paragraph (i.e. solving for
critical points) if there are suspicious roots. And if the
computed numbers of roots don't match up, we could concentrate
on refining only the "suspicious" roots.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-14 Thread Denis Lila
Hi Jim.

> How big are these errors expressed as multiples of the ULP of the
> coefficients?  Obviously 1e-17 is a lot smaller than 1e-4, but was
> 1e-17 representing "just a couple of bits of error" or was it still
> way off with respect to the numbers being used? And were these fairly
> obscure equations that were off?

The coefficients I used were eqn={-0.1, 0, 1, 1e-7} so compared to the ulps
of the coefficients, 1e-4 is pretty large.

I'm about to go now, but I would like to write this idea first:
it seems to me like the number of roots reported is much more
important than whether their accuracy is 1e-4 or 1e-17. So,
how about we solve for the roots of the derivative (which can be
done very quickly). Computing lim{x-->+/-inf}f(x) is very easy
(just a test on the most significant coefficient). We can then
evaluate the polynomial on the critical points and this would
allow us to very quickly compute the exact number of roots. If
the number computed using the closed form formula does not
match the real number, we do some refining work.

If we really wanted to optimize, we could also record how close
constants like D and q are to 0, and if they're within a certain
threshold, we could mark the roots they spawn as "suspicious",
and only do the test in the above paragraph (i.e. solving for
critical points) if there are suspicious roots. And if the
computed numbers of roots don't match up, we could concentrate
on refining only the "suspicious" roots.
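The counting scheme described above can be sketched roughly like this (an editor's sketch; the method name is hypothetical): solve the quadratic derivative, evaluate the cubic at the two critical points, and read off the exact number of distinct real roots.

```java
// Exact count of distinct real roots of a*x^3 + b*x^2 + c*x + d
// (a != 0), using the sign of the cubic at the critical points,
// i.e. at the roots of the derivative 3a*x^2 + 2b*x + c.
class RootCount {
    static int countRealRoots(double a, double b, double c, double d) {
        double disc = 4 * b * b - 12 * a * c;  // (2b)^2 - 4*(3a)*c
        if (disc <= 0) {
            return 1;  // monotone (or an inflection with horizontal tangent)
        }
        double sq = Math.sqrt(disc);
        double x1 = (-2 * b - sq) / (6 * a);
        double x2 = (-2 * b + sq) / (6 * a);
        double y1 = eval(a, b, c, d, Math.min(x1, x2));
        double y2 = eval(a, b, c, d, Math.max(x1, x2));
        // The curve crosses zero three times iff the extrema straddle it.
        if (y1 * y2 < 0) return 3;
        if (y1 == 0 || y2 == 0) return 2;  // a double root at an extremum
        return 1;
    }

    static double eval(double a, double b, double c, double d, double x) {
        return ((a * x + b) * x + c) * x + d;
    }

    public static void main(String[] args) {
        if (countRealRoots(1, -6, 11, -6) != 3) throw new AssertionError();
        if (countRealRoots(1, 0, 0, 1) != 1) throw new AssertionError();
        if (countRealRoots(1, -2, 1, 0) != 2) throw new AssertionError();
        System.out.println("ok");
    }
}
```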

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-14 Thread Jim Graham

Hi Denis,

On 12/14/2010 5:11 PM, Denis Lila wrote:

I have one question though: how fast does this have to be? I can come
up with fairly reasonable examples for which both CubicCurve2D.solveCubic
and the implementation I found give very inaccurate results


The existing method was not very accurate as you discovered so we don't 
necessarily need to go too far in terms of accuracy if it means changing 
its performance radically.



(i.e. evaluating the cubic on the computed "roots" gives numbers as
high as 1e-4). I read on the bug reports that what we should do is
treat the closed form as a crude approximation and then use the Newton
method to really find the roots. I like this idea (with perhaps the
exception of the Newton method - I think we might be better off
using something like false position, or even simply a binary search).
The binary search gives results that when evaluated are as small as
1e-17 which is as close to correct as possible in double precision
(because my termination condition was while(middle != start && middle != end)).
That didn't even take an outrageous number of iterations - 77 was the
highest I observed.


How big are these errors expressed as multiples of the ULP of the 
coefficients?  Obviously 1e-17 is a lot smaller than 1e-4, but was 1e-17 
representing "just a couple of bits of error" or was it still way off 
with respect to the numbers being used?  And were these fairly obscure 
equations that were off?


Not knowing these answers, it is hard to prioritize the added accuracy 
against the performance hit.


If the hit is very small for the vast majority of equations, and/or if 
we had a test for "already close enough" that eliminated the need for 
refining most roots then it might be worth it - otherwise we could 
consider adding a second version that called the first and then refined 
the results so the developer could choose the accuracy they needed.


Feel like running some timings?


4724552
4493128
4074742
4724556
(etc.  Those were just the bugs I found on the first 2 pages of a bug
database search)
Double (triple, etc.) credit - woohoo!  ;-)


Isn't there some sort of diminishing returns after the first duplicate ;-)


Not really, but if the number of dups is higher than some unknown value 
then people start looking at you with raised eyebrows...


http://i18.tinypic.com/34461wi.jpg

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-14 Thread Denis Lila
Hi Jim.

> You might want to submit it as a separate push and get credit for
> fixing 4645692 (solveCubic doesn't return all answers),

Sure, that sounds good. Reading through the code I found, I spotted
a few things that might have been problematic in some extremely
rare cases. I've been working on making it extra robust. I think I'm
pretty close to finishing.
I have one question though: how fast does this have to be? I can come
up with fairly reasonable examples for which both CubicCurve2D.solveCubic
and the implementation I found give very inaccurate results
(i.e. evaluating the cubic on the computed "roots" gives numbers as
high as 1e-4). I read on the bug reports that what we should do is
treat the closed form as a crude approximation and then use the Newton
method to really find the roots. I like this idea (with perhaps the
exception of the Newton method - I think we might be better off
using something like false position, or even simply a binary search).
The binary search gives results that when evaluated are as small as
1e-17 which is as close to correct as possible in double precision
(because my termination condition was while(middle != start && middle != end)).
That didn't even take an outrageous number of iterations - 77 was the
highest I observed.
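A sketch of the bisection refinement just described, with the same termination condition (editor's addition; names are hypothetical, but the eqn[] layout follows CubicCurve2D.solveCubic's documented convention):

```java
// Bisection refinement of a bracketed cubic root. Stops when the
// midpoint is no longer representable strictly between the bracket
// endpoints, i.e. when the bracket is down to adjacent doubles.
// Assumes f(start) and f(end) have opposite, nonzero signs.
class Bisect {
    static double refineRoot(double[] eqn, double start, double end) {
        double fs = eval(eqn, start);
        double middle = 0.5 * (start + end);
        while (middle != start && middle != end) {
            double fm = eval(eqn, middle);
            if (fm == 0) return middle;
            if (fs * fm < 0) {
                end = middle;
            } else {
                start = middle;
                fs = fm;
            }
            middle = 0.5 * (start + end);
        }
        return middle;
    }

    // eqn[] follows CubicCurve2D.solveCubic's layout:
    // eqn[0] + eqn[1]*x + eqn[2]*x^2 + eqn[3]*x^3.
    static double eval(double[] eqn, double x) {
        return ((eqn[3] * x + eqn[2]) * x + eqn[1]) * x + eqn[0];
    }

    public static void main(String[] args) {
        // x^3 - 2 = 0 has its root at cbrt(2); bracket it in [1, 2].
        double r = refineRoot(new double[] {-2, 0, 0, 1}, 1.0, 2.0);
        System.out.println(r);
        if (Math.abs(r * r * r - 2) > 1e-14) throw new AssertionError();
    }
}
```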

> 4724552
> 4493128
> 4074742
> 4724556
> (etc.  Those were just the bugs I found on the first 2 pages of a bug
> database search)
> Double (triple, etc.) credit - woohoo!  ;-)

Isn't there some sort of diminishing returns after the first duplicate ;-)

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Jim Graham

Hi Denis,

Those sound like just the kind of problems I believed existed in the 
CC2D algorithm.


You might want to submit it as a separate push and get credit for fixing 
4645692 (solveCubic doesn't return all answers), and maybe even the 
following failures in the containment methods (which could be closed as 
dups if this fixes the problems) as well:


4724552
4493128
4074742
4724556
(etc.  Those were just the bugs I found on the first 2 pages of a bug 
database search)


Double (triple, etc.) credit - woohoo!  ;-)

...jim

On 12/13/2010 2:30 PM, Denis Lila wrote:



Very nice!  How does it compare to CubicCurve.solveCubic() (which I know
has issues with the 2 root case, but I got it from a "reliable source" -
some textbook on Numerical Recipes)?


I wrote a test that generated 2559960 polynomials, and in 2493075 of
those, the computed roots were identical to within 1e-9. They were
different in the remaining 66885 cases, so that's 97.4% agreement.

I've looked through some of the differences, and in every case the
function from graphics gems is better in one of two ways:
1. the gg version will report more roots than the cc2d version, in
which case the polynomial has a double root and the cc2d version
completely misses it (example poly: a=19.00, b=-20.00,
c=-17.00, d=18.00, where cc2d misses the root at 1).

2. the gg version will report fewer roots than the cc2d version, in
which case there was a 0 root and the cc2d version incorrectly split
it into -1e-163 and 1e-163.

So, the graphics gems version seems to be much more stable. It
does have a problem where it can return NaN sometimes, because it
assumes that the polynomial is not a quadratic one, but that can
easily be fixed.

So, should I put this new cubic root finder in CubicCurve.solveCubic
and use that in pisces?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Denis Lila

> Very nice!  How does it compare to CubicCurve.solveCubic() (which I
> know has issues with the 2 root case, but I got it from a "reliable
> source" - some textbook on Numerical Recipes)?

I wrote a test that generated 2559960 polynomials, and in 2493075 of
those, the computed roots were identical to within 1e-9. They were
different in the remaining 66885 cases, so that's 97.4% agreement.

I've looked through some of the differences, and in every case the
function from graphics gems is better in one of two ways:
1. the gg version will report more roots than the cc2d version, in
which case the polynomial has a double root and the cc2d version
completely misses it (example poly: a=19.00, b=-20.00,
c=-17.00, d=18.00, where cc2d misses the root at 1).

2. the gg version will report fewer roots than the cc2d version, in
which case there was a 0 root and the cc2d version incorrectly split
it into -1e-163 and 1e-163.

So, the graphics gems version seems to be much more stable. It
does have a problem where it can return NaN sometimes, because it
assumes that the polynomial is not a quadratic one, but that can
easily be fixed.
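The double-root example above checks out by hand: 19x^3 - 20x^2 - 17x + 18 factors as (x - 1)^2 * (19x + 18), so x = 1 really is a double root (editor's verification):

```java
// Verify that x = 1 is a double root of 19x^3 - 20x^2 - 17x + 18:
// both the cubic and its derivative 57x^2 - 40x - 17 vanish there,
// and the remaining simple root is -18/19.
class DoubleRootCheck {
    static double eval(double x) {
        return ((19 * x - 20) * x - 17) * x + 18;
    }

    public static void main(String[] args) {
        if (eval(1.0) != 0.0) throw new AssertionError();
        if (57 - 40 - 17 != 0) throw new AssertionError();  // derivative at 1
        if (Math.abs(eval(-18.0 / 19.0)) > 1e-12) throw new AssertionError();
        System.out.println("double root at 1 confirmed");
    }
}
```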

So, should I put this new cubic root finder in CubicCurve.solveCubic
and use that in pisces?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Jim Graham
Very nice!  How does it compare to CubicCurve.solveCubic() (which I know 
has issues with the 2 root case, but I got it from a "reliable source" - 
some textbook on Numerical Recipes)?


Also, one area that I had issues with the version I used in CC2D was 
that it chose a hard cutoff to classify the number of points and 
floating point errors might cause a given cubic to jump over that point 
despite the fact that the equation was of the other form.  Hopefully 
that jumping between root counts only happens when two roots are very 
close to each other so that the choice is between listing "N" or 
"N+epsilon and N-epsilon" - I can live with that inaccuracy, but it 
seemed like the version in CC2D might choose between "it's either a 
single root of 4.25, or three roots of -127.3, 23.5, and 42.6" and I 
would scratch my head and think - wow, what a difference a bit made!


Luckily I don't think we actually ever relied on CC2D.solveCubic for 
correctness in any of our code, but it would be nice to fix it if a more 
stable method is available.


...jim

On 12/13/2010 12:23 PM, Denis Lila wrote:

Hi again.

I found an implementation of a closed form cubic root solver
(from graphics gems):
http://read.pudn.com/downloads21/sourcecode/graph/71499/gems/Roots3And4.c__.htm

I did some micro benchmarks, and it's about 25% slower than the one I have.
I'm thinking we should use it anyway because it's much better in every
other way: it finds all roots, it doesn't make its callers give it a root
array that is longer than the total number of roots, and most importantly,
it doesn't fail because of an iteration upper bound. From my testing, I noticed
that this happens too frequently for comfort in my cubicRootsInAB. I haven't
noticed any rendering artifacts caused by this, but I also haven't tried
rendering every possible curve and it's better to prevent bugs than fix them.

What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Jim Graham

On 12/13/2010 10:54 AM, Denis Lila wrote:

Hi Jim.


With respect to finding a cubic root, currently you are doing that in
2 dimensions, but what if we converted to 1 dimension?
Consider that the control polygon is "fairly linear".  What if we
rotated our perspective so that it was horizontal and then squashed it
flat?  Consider instead a 1 dimensional bezier with control values
of:
(where |mn| is the length of the m->n control polygon of the original
curve - sum of all segments from point m to point n)
0.0, |01|, |02|, |03|


I had thought of something like this but I was afraid that the loss of
Curve.java:141-152 would hurt accuracy. I implemented this though, and
testing shows that that's not a problem. This should also double the
performance of the computation since we only run one cubic root finder,
and that was the major bottleneck.
I updated the webrev.


Great!


Should I remove some no longer needed methods, like getTCloseTo?


If you are confident that we don't need them any more (has that one ever 
really ever been added?  I think it is still in this same webrev review 
cycle so it would be "adding a method that was never used", no?)



Solve that 1 dimensional bezier for v=(leaflen*polylen)/linelen...


Don't you mean (targetLength - lenAtLastT) * polylen / leaflen?


I guess.  I lost track of the terms in the discussion.  What I meant was 
"the length of how far into this curve we need to go scaled by the 
difference between that curve's control polygon length and that curve's 
chord length".  The values you write seem to be those so I guess that's 
right, but I'd need to see it in situ to verify...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Denis Lila
Hi again.

I found an implementation of a closed form cubic root solver
(from graphics gems):
http://read.pudn.com/downloads21/sourcecode/graph/71499/gems/Roots3And4.c__.htm

I did some micro benchmarks, and it's about 25% slower than the one I have.
I'm thinking we should use it anyway because it's much better in every
other way: it finds all roots, it doesn't make its callers give it a root
array that is longer than the total number of roots, and most importantly,
it doesn't fail because of an iteration upper bound. From my testing, I noticed
that this happens too frequently for comfort in my cubicRootsInAB. I haven't
noticed any rendering artifacts caused by this, but I also haven't tried
rendering every possible curve and it's better to prevent bugs than fix them.

What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Denis Lila
Hi Jim.

> With respect to finding a cubic root, currently you are doing that in
> 2 dimensions, but what if we converted to 1 dimension?
> Consider that the control polygon is "fairly linear".  What if we 
> rotated our perspective so that it was horizontal and then squashed it
> flat?  Consider instead a 1 dimensional bezier with control values
> of:
> (where |mn| is the length of the m->n control polygon of the original
> curve - sum of all segments from point m to point n)
> 0.0, |01|, |02|, |03|

I had thought of something like this but I was afraid that the loss of
Curve.java:141-152 would hurt accuracy. I implemented this though, and
testing shows that that's not a problem. This should also double the
performance of the computation since we only run one cubic root finder,
and that was the major bottleneck.
I updated the webrev.

Should I remove some no longer needed methods, like getTCloseTo?

> Solve that 1 dimensional bezier for v=(leaflen*polylen)/linelen...

Don't you mean (targetLength - lenAtLastT) * polylen / leaflen?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Denis Lila
Hi Jim.

> Woohoo!

:)

> >> How often do we end up needing getTCloseTo in practice?
> >
> > It depends on the ratios of the lengths of the sides of the control
> > polygon. The closer they are to 1, the less we need it. I'm not
> sure
> > how to answer more precisely - for that I would need a
> representative
> > sample of curves drawn in practice.
> 
> I was thinking of, say, when applied to a circle.  Does that get by 
> without needing getTCloseTo?

I tested it on a couple of quarter circles of greatly varying radii,
and surprisingly, it does get by without needing getTCloseTo (or its
equivalent, after your "flattening" proposal in your previous e-mail).

> Good job!

Well, thanks for all your help.
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Jim Graham

Hi Denis,

The example I gave was intended to be very crude - I was simply 
describing the technique, but as I said it would require better math to 
really know what the right formula would be.


With respect to finding a cubic root, currently you are doing that in 2 
dimensions, but what if we converted to 1 dimension?


Consider that the control polygon is "fairly linear".  What if we 
rotated our perspective so that it was horizontal and then squashed it 
flat?  Consider instead a 1 dimensional bezier with control values of:


(where |mn| is the length of the m->n control polygon of the original 
curve - sum of all segments from point m to point n)


0.0, |01|, |02|, |03|

Solve that 1 dimensional bezier for v=(leaflen*polylen)/linelen...
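The "squash to 1-D" idea can be sketched like so (editor's sketch, not the webrev code): build the cumulative control polygon lengths and treat them as the control values of a scalar cubic Bezier.

```java
// Replace the 2-D curve by a 1-D Bezier whose control values are the
// cumulative control polygon lengths 0, |01|, |02|, |03|, then evaluate
// (or solve) that scalar cubic instead of working in two dimensions.
class Squash1D {
    // Cumulative polygon lengths for a 2-D cubic with control
    // points (x[0],y[0])..(x[3],y[3]).
    static double[] controlValues(double[] x, double[] y) {
        double[] v = new double[4];
        for (int i = 1; i < 4; i++) {
            v[i] = v[i - 1] + Math.hypot(x[i] - x[i - 1], y[i] - y[i - 1]);
        }
        return v;
    }

    // de Casteljau evaluation of the 1-D cubic Bezier at t.
    static double eval(double[] v, double t) {
        double a = v[0] + t * (v[1] - v[0]);
        double b = v[1] + t * (v[2] - v[1]);
        double c = v[2] + t * (v[3] - v[2]);
        double ab = a + t * (b - a);
        double bc = b + t * (c - b);
        return ab + t * (bc - ab);
    }

    public static void main(String[] args) {
        // Degenerate check: for a straight line the cumulative lengths
        // are linear in the control index, so the 1-D Bezier reduces to
        // t * |03| and eval(v, 0.5) is exactly half the polygon length.
        double[] v = controlValues(new double[] {0, 1, 2, 3},
                                   new double[] {0, 0, 0, 0});
        System.out.println(eval(v, 0.5));  // 1.5
    }
}
```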

...jim

On 12/10/2010 8:23 AM, Denis Lila wrote:

Hi Jim.


Actually, even if the lengths aren't close the lengths may give you
enough information about the acceleration along the curve that you can
do a decent approximation of the accelerated T value.  The T could be
biased by some formula that is weighted by the ratios of the control
polygon lengths.  As a very crude example, say you assumed that if the
scaled leaf length fell into the first polygon segment's length then t
should be proportionally a value from 0 to 1/3, and if it fell between
the first poly len and the second then it would be proportionally a
value from 1/3 to 2/3, etc.  The code might look like this:


I implemented this, and I'm not sure how to use this new approximation.
I mean, currently there are really two t's. The first one is the parameter
along the line connecting the 2 endpoints of the curve and the second
is the result that we return. We can't use this new approximation to
replace the first t, because for a curve like
(0,0),(1000,0),(1000,0),(1000,0) and a desired length of 500, the t
would be 1/6, so the computed (x,y) would be (1000/6,0) instead of
(500,0), which would be right (and which is what we compute now).

The only sensible way to use this kind of approximation would be as a
direct replacement for getTCloseTo. I tried that, and its quality to
speed ratio compared to getTCloseTo is remarkably good, but it's not
really usable because the differences are very noticeable.
I'll try to implement a good cubic root finder this weekend, and maybe
then getTCloseTo will be much faster and we won't have to worry about
this.


(Also, note that in the original code we probably would have just been
dashing along the flattened curve anyway and so we might have just
been
using the raw linear t in that case - so anything we do here is a
refinement of what we used to do and "icing on the cake" to some
extent)...


I'd say the dashing precision is better than what we used to have. It's
only slightly better, but even that is impressive when you consider that
we were doing up to 1024 subdivisions before, and now it's only 16, I think.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Jim Graham

On 12/10/2010 8:27 AM, Denis Lila wrote:

Hi Jim.

Yes. The improvement shown by the benchmarks is substantial.


Then this is great news!


Indeed :-)


Woohoo!


How often do we end up needing getTCloseTo in practice?


It depends on the ratios of the lengths of the sides of the control
polygon. The closer they are to 1, the less we need it. I'm not sure
how to answer more precisely - for that I would need a representative
sample of curves drawn in practice.


I was thinking of, say, when applied to a circle.  Does that get by 
without needing getTCloseTo?



However, I did run dashing and stroking benchmarks. Stroking shows
25% speedup (because of the refinements to the transform pipeline)
and cubic dashing has improved by 80%. Of course, all this is useless
if I've done something to make things look horrible, so I'm going to
run the gfx tests again.


Good job!

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Denis Lila
> Of course, all this is useless
> if I've done something to make things look horrible, so I'm going to
> run the gfx tests again.

I just ran them. All is good. The only change compared to the old test
result is that the number of dashed round rectangles that are identical
to what is produced by the closed source code has doubled. This isn't
all that significant, since it has gone from 4 identical images out of
162 total images to 8 out of 162, but it's still nice, and it
definitely means that nothing has gotten worse.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Denis Lila
Hi Jim.

> By "without this optimization" do you mean back when you did a full
> scan for the proper T?

Yes. The improvement shown by the benchmarks is substantial.

> Then this is great news!

Indeed :-)

> How often do we end up needing getTCloseTo in practice?

It depends on the ratios of the lengths of the sides of the control
polygon. The closer they are to 1, the less we need it. I'm not sure
how to answer more precisely - for that I would need a representative
sample of curves drawn in practice.

However, I did run dashing and stroking benchmarks. Stroking shows
25% speedup (because of the refinements to the transform pipeline)
and cubic dashing has improved by 80%. Of course, all this is useless
if I've done something to make things look horrible, so I'm going to
run the gfx tests again.

Regards,
Denis.

- "Jim Graham"  wrote:

> 
>   ...jim
> 
> On 12/8/2010 1:54 PM, Denis Lila wrote:
> > Hi Jim.
> >
> >>> How about "if the 3 segments of the control polygon are all close
> to
> >>> each other in length and angle", then the optimization applies. 
> Is
> >>> that easy to test?
> >>
> >> Hmm, that would actually be extremely easy to test and it would
> cost
> >> almost nothing. We already compute the control polygon lengths in
> onLeaf, and
> >> we're already assuming that every leaf is "flat enough", so we
> >> probably don't even need to check the angles. Comparing lengths
> should be good
> >> enough.
> >> I'll try this out.
> >
> > I implemented this and updated the webrev. I tested on a few curves
> with very high
> > and very low accelerations. The results were identical to what I
> used to get
> > without this optimization. When the curve has no acceleration, all
> calls of getTCloseTo
> > are skipped.
> >
> > Regards,
> > Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Denis Lila
Hi Jim.

> Actually, even if the lengths aren't close the lengths may give you 
> enough information about the acceleration along the curve that you can
> do a decent approximation of the accelerated T value.  The T could be
> biased by some formula that is weighted by the ratios of the control 
> polygon lengths.  As a very crude example, say you assumed that if the
> scaled leaf length fell into the first polygon segment's length then t
> should be proportionally a value from 0 to 1/3, and if it fell between
> the first poly len and the second then it would be proportionally a 
> value from 1/3 to 2/3, etc.  The code might look like this:

I implemented this, and I'm not sure how to use this new approximation.
I mean, currently there are really two t's. The first one is the parameter
along the line connecting the 2 endpoints of the curve and the second
is the result that we return. We can't use this new approximation to
replace the first t, because for a curve like
(0,0),(1000,0),(1000,0),(1000,0) and a desired length of 500, the t
would be 1/6, so the computed (x,y) would be (1000/6,0) instead of
(500,0), which would be right (and which is what we compute now).

The only sensible way to use this kind of approximation would be as a
direct replacement for getTCloseTo. I tried that, and its quality to
speed ratio compared to getTCloseTo is remarkably good, but it's not
really usable because the differences are very noticeable.
I'll try to implement a good cubic root finder this weekend, and maybe
then getTCloseTo will be much faster and we won't have to worry about
this.

> (Also, note that in the original code we probably would have just been
> dashing along the flattened curve anyway and so we might have just
> been 
> using the raw linear t in that case - so anything we do here is a 
> refinement of what we used to do and "icing on the cake" to some
> extent)...

I'd say the dashing precision is better than what we used to have. It's
only slightly better, but even that is impressive when you consider that
we were doing up to 1024 subdivisions before, and now it's only 16, I think.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Jim Graham
By "without this optimization" do you mean back when you did a full scan 
for the proper T?  Then this is great news!


How often do we end up needing getTCloseTo in practice?

...jim

On 12/8/2010 1:54 PM, Denis Lila wrote:

Hi Jim.


How about "if the 3 segments of the control polygon are all close to
each other in length and angle", then the optimization applies.  Is
that easy to test?


Hmm, that would actually be extremely easy to test and it would cost
almost nothing. We already compute the control polygon lengths in onLeaf, and
we're already assuming that every leaf is "flat enough", so we
probably don't even need to check the angles. Comparing lengths should be good
enough.
I'll try this out.


I implemented this and updated the webrev. I tested on a few curves with very
high and very low accelerations. The results were identical to what I used to get
without this optimization. When the curve has no acceleration, all calls of
getTCloseTo are skipped.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Jim Graham

Hi Denis,

On 12/8/2010 12:04 PM, Denis Lila wrote:

I'm not sure how the closed interval is awkward.  Isn't it just proper
choice of ">= and <= vs. > and <" in the testing method?


In the filtering function, yes, but I was referring to cubicRootsInAB in
Helpers:122-133 where we iterate through intervals. For each interval,
we have the values of the function at the ends, and if the left one
is 0, we just add it as a zero and skip the call to CubicNewton. In order
to get roots in [A,B], we would either have to test both endpoints (which
would be more expensive and it would force us to find a way of avoiding
duplicate roots), or we would have to deal with the last interval as
a special case. The latter is not that bad, but it is more awkward than
what we have now.


Perhaps it would be better if RootsInAB would advertise that it is 
returning approximations of a high precision, but they won't be exact 
and roots near the endpoints may not be caught and so the caller should 
be prepared to evaluate the endpoints manually to see if they represent 
interesting values for the purposes of why the roots were requested.


And then do that in the functions that call it.


How about "if the 3 segments of the control polygon are all close to
each other in length and angle", then the optimization applies.  Is
that easy to test?


Hmm, that would actually be extremely easy to test and it would cost almost
nothing. We already compute the control polygon lengths in onLeaf, and
we're already assuming that every leaf is "flat enough", so we probably
don't even need to check the angles. Comparing lengths should be good enough.
I'll try this out.


Actually, even if the lengths aren't close the lengths may give you 
enough information about the acceleration along the curve that you can 
do a decent approximation of the accelerated T value.  The T could be 
biased by some formula that is weighted by the ratios of the control 
polygon lengths.  As a very crude example, say you assumed that if the 
scaled leaf length fell into the first polygon segment's length then t 
should be proportionally a value from 0 to 1/3, and if it fell between 
the first poly len and the second then it would be proportionally a 
value from 1/3 to 2/3, etc.  The code might look like this:


polylen = l01 + l12 + l23
linelen = l03
// If l01==l12==l23 then most of the following becomes
// a NOP and t=leaflen/linelen
polyleaflen = leaflen * polylen / linelen;
if polyleaflen < l01 then t = (polyleaflen/l01)/3
else if polyleaflen < l01+l12 then t = ((polyleaflen-l01)/l12 + 1)/3
else t = ((polyleaflen-l01-l12)/l23 + 2)/3

An even better approximation would involve some more math, but that 
might be better than simply using the linear interpolation along the 
segment connecting their endpoints.
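The crude biasing above, written out as a compilable Java method (the name biasedT is made up for this sketch; it is not in the webrev):

```java
public class BiasedT {
    // Crude accelerated-t approximation: scale the leaf length onto the
    // control polygon, find which of the three polygon segments it lands
    // in, and interpolate t within the matching third of [0, 1].
    static double biasedT(double leaflen, double linelen,
                          double l01, double l12, double l23) {
        double polylen = l01 + l12 + l23;
        double pll = leaflen * polylen / linelen;   // "polyleaflen"
        if (pll < l01) {
            return (pll / l01) / 3;
        } else if (pll < l01 + l12) {
            return ((pll - l01) / l12 + 1) / 3;
        } else {
            return ((pll - l01 - l12) / l23 + 2) / 3;
        }
    }

    public static void main(String[] args) {
        // Equal segments: the bias is a no-op and t = leaflen/linelen.
        System.out.println(biasedT(0.5, 3, 1, 1, 1));  // 0.5/3
    }
}
```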


(Also, note that in the original code we probably would have just been 
dashing along the flattened curve anyway and so we might have just been 
using the raw linear t in that case - so anything we do here is a 
refinement of what we used to do and "icing on the cake" to some extent)...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Denis Lila
Hi Jim.

> > How about "if the 3 segments of the control polygon are all close to
> > each other in length and angle", then the optimization applies.  Is
> > that easy to test?
> 
> Hmm, that would actually be extremely easy to test and it would cost
> almost nothing. We already compute the control polygon lengths in onLeaf, and
> we're already assuming that every leaf is "flat enough", so we
> probably don't even need to check the angles. Comparing lengths should be good
> enough.
> I'll try this out.

I implemented this and updated the webrev. I tested on a few curves with very
high and very low accelerations. The results were identical to what I used to get
without this optimization. When the curve has no acceleration, all calls of
getTCloseTo are skipped.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Denis Lila
> I'm not sure how the closed interval is awkward.  Isn't it just proper
> choice of ">= and <= vs. > and <" in the testing method?

In the filtering function, yes, but I was referring to cubicRootsInAB in
Helpers:122-133 where we iterate through intervals. For each interval,
we have the values of the function at the ends, and if the left one
is 0, we just add it as a zero and skip the call to CubicNewton. In order
to get roots in [A,B], we would either have to test both endpoints (which
would be more expensive and it would force us to find a way of avoiding
duplicate roots), or we would have to deal with the last interval as
a special case. The latter is not that bad, but it is more awkward than
what we have now.
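The interval-walking scheme described above can be sketched for a generic function as follows (this illustrates the half-open [A, B) convention only; it is not the actual Helpers.cubicRootsInAB code):

```java
public class RootsInAB {
    interface F { double eval(double t); }

    // Walk [A, B) in equal intervals. A root sitting exactly on an
    // interval's left endpoint is recorded directly (skipping the
    // refinement step); otherwise a sign change across the interval is
    // refined by bisection. Right endpoints are never tested, so the
    // overall interval is half-open and no root is reported twice.
    static int roots(F f, double A, double B, int samples, double[] out) {
        int num = 0;
        double step = (B - A) / samples;
        double x0 = A, y0 = f.eval(x0);
        for (int i = 0; i < samples; i++) {
            double x1 = A + (i + 1) * step, y1 = f.eval(x1);
            if (y0 == 0) {
                out[num++] = x0;           // exact root on the left endpoint
            } else if (y0 * y1 < 0) {      // sign change: refine by bisection
                double lo = x0, hi = x1;
                for (int j = 0; j < 50; j++) {
                    double mid = 0.5 * (lo + hi);
                    if (f.eval(mid) * y0 > 0) lo = mid; else hi = mid;
                }
                out[num++] = 0.5 * (lo + hi);
            }
            x0 = x1; y0 = y1;
        }
        return num;
    }

    public static void main(String[] args) {
        double[] r = new double[4];
        // f(t) = (t - 0.25)(t - 0.75) has two roots in [0, 1).
        int n = roots(t -> (t - 0.25) * (t - 0.75), 0, 1, 8, r);
        System.out.println(n + " roots: " + r[0] + ", " + r[1]);  // 2 roots: 0.25, 0.75
    }
}
```

Because only left endpoints are tested for an exact zero, a root that falls precisely on an interval boundary is counted once, which is the duplicate-avoidance the half-open convention buys.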

> How about "if the 3 segments of the control polygon are all close to 
> each other in length and angle", then the optimization applies.  Is
> that easy to test?

Hmm, that would actually be extremely easy to test and it would cost almost
nothing. We already compute the control polygon lengths in onLeaf, and
we're already assuming that every leaf is "flat enough", so we probably
don't even need to check the angles. Comparing lengths should be good enough.
I'll try this out.

Thank you,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Jim Graham

On 12/8/2010 9:37 AM, Denis Lila wrote:

Shouldn't it be [A, B]?


I thought about this when implementing it, but I don't think it mattered
whether it was closed or half open, and the closed interval would have been
somewhat more awkward to implement.


I'm not sure how the closed interval is awkward.  Isn't it just proper 
choice of ">= and <= vs. > and <" in the testing method?



getMaxAcc functions - don't we want the furthest value from 0,
positive or negative?  You are looking for most positive value
and negative accelerations are equally problematic, aren't they?
If so then these functions need some work.


You're right about both, but there's a much more serious problem that I
didn't think of when writing them: the value I compute in the if
statement in Dasher:355 is not an upper bound on the acceleration of
the curve. The acceleration is:
C'(t).dot(C''(t))/len(C'(t)) which in terms of the parameter polynomials is
(x'(t)*x''(t) + y'(t)*y''(t))/sqrt(x'(t)^2 + y'(t)^2)
What those functions would compute if they were "correct" would be
max(abs(x''(t))) and max(abs(y''(t))), and the sum of these is not
closely related to the maximum absolute acceleration, which is what we
want.
Without the upper bound property, I don't think it's a very meaningful
test, and I think we should abandon this optimization. Do you agree?


How about "if the 3 segments of the control polygon are all close to 
each other in length and angle", then the optimization applies.  Is that 
easy to test?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Denis Lila
Hi Jim.

> The main problem is that "must" doesn't exist for IEEE floating point
> numbers.  You can find the root for one of the endpoints and it may 
> return "t = -.1" even though the value exactly matched the
> endpoint, but after all the math was said and done the answer
> it came up had the bit pattern for a tiny negative number, not
> 0 (or 1.0001).  That t value would be rejected and then you'd
> have no roots.

That's true. That's what I meant when I said "finite precision math
doesn't necessarily care what calculus says" ;-)

> It's not hurting anything and we may find it useful in other contexts.
> We probably should have put it in a more generic package.

Sounds good. I suppose we can move it if the need ever arises.

> Shouldn't it be [A, B]?

I thought about this when implementing it, but I don't think it mattered
whether it was closed or half open, and the closed interval would have been
somewhat more awkward to implement.

> getMaxAcc functions - don't we want the furthest value from 0,
> positive or negative?  You are looking for most positive value
> and negative accelerations are equally problematic, aren't they?
> If so then these functions need some work.

You're right about both, but there's a much more serious problem that I
didn't think of when writing them: the value I compute in the if
statement in Dasher:355 is not an upper bound on the acceleration of
the curve. The acceleration is:
C'(t).dot(C''(t))/len(C'(t)) which in terms of the parameter polynomials is
(x'(t)*x''(t) + y'(t)*y''(t))/sqrt(x'(t)^2 + y'(t)^2)
What those functions would compute if they were "correct" would be 
max(abs(x''(t))) and max(abs(y''(t))), and the sum of these is not
closely related to the maximum absolute acceleration, which is what we
want.
Without the upper bound property, I don't think it's a very meaningful
test, and I think we should abandon this optimization. Do you agree?
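Evaluated directly, that acceleration formula reads as follows (a standalone illustration; the accel helper and its derivative-coefficient layout are mine, not Pisces code):

```java
public class CurveAccel {
    // Tangential acceleration of a parametric curve at t, per the formula
    // quoted above: (x'x'' + y'y'') / sqrt(x'^2 + y'^2).
    // dx and dy hold the polynomial coefficients of x'(t) and y'(t)
    // (the derivative of the cubic), lowest degree first.
    static double accel(double[] dx, double[] dy, double t) {
        double xp  = dx[0] + t * (dx[1] + t * dx[2]);  // x'(t)
        double yp  = dy[0] + t * (dy[1] + t * dy[2]);  // y'(t)
        double xpp = dx[1] + 2 * t * dx[2];            // x''(t)
        double ypp = dy[1] + 2 * t * dy[2];            // y''(t)
        return (xp * xpp + yp * ypp) / Math.sqrt(xp * xp + yp * yp);
    }

    public static void main(String[] args) {
        // Uniformly parameterized line x(t) = t, y(t) = 0: x'(t) = 1,
        // so speed is constant and the acceleration is 0.
        System.out.println(accel(new double[]{1, 0, 0},
                                 new double[]{0, 0, 0}, 0.3));  // 0.0
    }
}
```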

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Jim Graham

Hi Denis,

On 12/7/2010 10:47 AM, Denis Lila wrote:

Hi Jim.


I'm sure you will likely find a root, but the method you are using is
"roots*inAB*" which may throw the root out because it is out of range,
no?


I'm sure some roots will be thrown out, but I think in every call of
getTCloseTo there will be at least one root that isn't thrown out. That's
because our [A,B) is always [0, 1), and the point x,y that we pass
to getTCloseTo lies on the line joining the endpoints of the curve on
which getTCloseTo is called, so there must be some root in [0, 1). In fact,
there will be two roots, one for each parametric polynomial.


The main problem is that "must" doesn't exist for IEEE floating point 
numbers.  You can find the root for one of the endpoints and it may 
return "t = -.1" even though the value exactly matched the endpoint, 
but after all the math was said and done the answer it came up had the 
bit pattern for a tiny negative number, not 0 (or 1.0001).  That t value 
would be rejected and then you'd have no roots.



I implemented your way. I couldn't get rid of outat, however. In that case
where we have to apply a non orthogonal transform with no normalization we
want to just apply the transform at the end of stroking and feed stroker
untransformed paths. So, now I have both outat and strokerat. At least
one of them must always be null. In the case I mention above, strokerat will
be null, and the calls to *deltaTransformConsumer(pc2d, strokerat) won't
do anything, but the transformConsumer(pc2d, outat) call will take care of
the post stroker transformation.


Oh, I see, we have 3 possible chains:

   PI => AT => stroke => render
   PI => AT => normalize => 1/at => stroke => at => render
   PI => normalize => stroke => AT => render

So sometimes the final transform needs to be full and sometimes it needs 
to be delta.  It looks like it works.



I don't think the TranslateFilter will ever be used, because transformConsumer
is now called only with outat, which cannot be a translation. So we may want
to remove it.


It's not hurting anything and we may find it useful in other contexts. 
We probably should have put it in a more generic package.



I also fixed the filterOutNotInAB function. It was violating cubicRootsInAB's
spec by filtering out everything not in (A, B), as opposed to everything not
in [A, B), which is what it should do.


Shouldn't it be [A, B]?

New stuff:

Curve.java:

getMaxAcc functions - don't we want the furthest value from 0, positive 
or negative?  You are looking for most positive value and negative 
accelerations are equally problematic, aren't they?  If so then these 
functions need some work.


lines 118,141 - aren't these always true?  Should this be "&&"?

I also still need to look at the new Renderer stuff...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-07 Thread Denis Lila
Hi Jim.

> I'm sure you will likely find a root, but the method you are using is
> "roots*inAB*" which may throw the root out because it is out of range,
> no?

I'm sure some roots will be thrown out, but I think in every call of
getTCloseTo there will be at least one root that isn't thrown out. That's
because our [A,B) is always [0, 1), and the point x,y that we pass
to getTCloseTo lies on the line joining the endpoints of the curve on
which getTCloseTo is called, so there must be some root in [0, 1). In fact,
there will be two roots, one for each parametric polynomial.

> Read the IEEE spec on NaN.  It's a special value that has this bizarre
> property that it is the only number that is not equal to itself. ;-) 
> In fact, the test for NaN is usually "if (x == x) <not NaN> else <NaN>". 
> If you want to be explicit and formal then you can use the static 
> Float.isNaN() method (which is essentially that test - x!=x).

Interesting. I did not know that. I fixed these.

> A lot of the lines before you test MaxAcc are not needed unless you go
> into the if.  In particular, x,y,[xy][01] are only used if you call 
> getTCloseTo().

Fixed.

> Actually I think you've updated the AFD code so I should really take a
> look... :-(
> 
> ;-)

Oh, of course! I completely forgot about that. I think I only changed
the quadratic AFD code to take into account the constant acceleration of
quadratic curves.

> The problem is that normalization needs proper sub-pixel positioning
> so you need to hand it the true device space coordinates with proper 
> translation.  You need this:

Right, so I was wrong about normalization and translation being commutative.

> Using these two methods I don't think you need any transforms other
> than the original one - so all you need is "strokerat" which replaces both
> outat and inat and is either null (no special transform needed) or the
> original AT when special transforms are needed...

I implemented your way. I couldn't get rid of outat, however. In that case
where we have to apply a non orthogonal transform with no normalization we
want to just apply the transform at the end of stroking and feed stroker
untransformed paths. So, now I have both outat and strokerat. At least
one of them must always be null. In the case I mention above, strokerat will
be null, and the calls to *deltaTransformConsumer(pc2d, strokerat) won't
do anything, but the transformConsumer(pc2d, outat) call will take care of
the post stroker transformation.

I don't think the TranslateFilter will ever be used, because transformConsumer
is now called only with outat, which cannot be a translation. So we may want
to remove it.

I also fixed the filterOutNotInAB function. It was violating cubicRootsInAB's
spec by filtering out everything not in (A, B), as opposed to everything not
in [A, B), which is what it should do.

I uploaded the new webrev.

Thank you,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> On 12/6/2010 4:21 PM, Denis Lila wrote:
> > Hi Jim.
> >
> >> line 134 - what if numx or numy == 0 because only roots outside
> [0,1]
> >> were found?
> >
> > In this case lines 151-162 will execute, and nothing is wrong. The
> only
> > problem is when both numx and numy are 0. This is certainly possible
> in
> > the general case (but only with quadratic curves), but the way we're
> using
> > this function, the intermediate value theorem guarantees at least
> one root
> > will be found. Of course, finite precision math doesn't necessarily
> care
> > what calculus has to say, but in this case I can't see how the root
> computation
> > could fail. On the other hand, one could argue that this is exactly
> why we need
> > to defend against this case, so I've added some checks.
> 
> 
> >> line 145 - what if d0 and d1 are both 0?  NaN results.  What if
> you
> >> just used a simple "d0<  d1 ? tx : ty" - how far off would that
> be?
> >
> > I tried that out on a curve with very high acceleration, and it
> makes a noticeable
> > difference. So, instead I'm using
> > if (w0 == Float.NaN) {
> > return tx;
> > }
> 
> Same thing on Dasher line 363 where you also test for NaN.
> 
> >> line 357 - another optimization would be to check the acceleration
> in
> >> the range and if it is below a threshold then simply use the t
> from
> >> line 348 as the t for the split
> >
> > I like this. I tried implementing it. I haven't tested it yet
> though, and
> > I still have to pick a good cutoff acceleration. For now I'm using
> 1/leaflen.
> > We definitely don't want it to just be a constant, since the longer
> the leaf,
> > the worse it will be to allow acceleration, so for longer leaves we
> want to
> > skip the getTCloseTo call only if the acceleration is smaller.
> 
> >> Renderer.java:  Is this just a straight copy of what I was working
> on?
> >
> > I'm pretty sure it is, yes.
> 
> 
> >> TransformingPathConsumer2D:
> >>
> >> Any thoughts on whether we need translation in the scale filter
> and
> >> whe

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-06 Thread Jim Graham

Hi Denis,

On 12/6/2010 4:21 PM, Denis Lila wrote:

Hi Jim.


line 134 - what if numx or numy == 0 because only roots outside [0,1]
were found?


In this case lines 151-162 will execute, and nothing is wrong. The only
problem is when both numx and numy are 0. This is certainly possible in
the general case (but only with quadratic curves), but the way we're using
this function, the intermediate value theorem guarantees at least one root
will be found. Of course, finite precision math doesn't necessarily care
what calculus has to say, but in this case I can't see how the root computation
could fail. On the other hand, one could argue that this is exactly why we need
to defend against this case, so I've added some checks.


I'm sure you will likely find a root, but the method you are using is 
"roots*inAB*" which may throw the root out because it is out of range, no?


In looking at that method it looks like the cubic handling code tries 0 
and 1 in addition to the critical points it finds using a root, but the 
quadratic code that it chooses if a==0 will throw out all roots outside 
the 0,1 range and may end up with 0 answers.  The cubic code further can 
reject all of the points (if they are all non-zero and same sign) and 
also return no answers, but may have fewer cases where it would do that.


Still, my point was not that you might fail to find a root, but that the 
roots may get rejected and end up with no answers in range.



line 145 - what if d0 and d1 are both 0?  NaN results.  What if you
just used a simple "d0<  d1 ? tx : ty" - how far off would that be?


I tried that out on a curve with very high acceleration, and it makes a 
noticeable
difference. So, instead I'm using
if (w0 == Float.NaN) {
return tx;
}


Read the IEEE spec on NaN.  It's a special value that has this bizarre 
property that it is the only number that is not equal to itself. ;-)  In 
fact, the test for NaN is usually "if (x == x) <not NaN> else <NaN>".  If 
you want to be explicit and formal then you can use the static 
Float.isNaN() method (which is essentially that test - x!=x).
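Jim's point, as a runnable snippet:

```java
public class NanDemo {
    public static void main(String[] args) {
        float w0 = 0f / 0f;                   // one way to produce NaN
        System.out.println(w0 == Float.NaN);  // false: NaN equals nothing,
                                              // so that test never fires
        System.out.println(w0 != w0);         // true: the idiomatic NaN test
        System.out.println(Float.isNaN(w0));  // true: the formal version
    }
}
```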


Same thing on Dasher line 363 where you also test for NaN.


line 357 - another optimization would be to check the acceleration in
the range and if it is below a threshold then simply use the t from
line 348 as the t for the split


I like this. I tried implementing it. I haven't tested it yet though, and
I still have to pick a good cutoff acceleration. For now I'm using 1/leaflen.
We definitely don't want it to just be a constant, since the longer the leaf,
the worse it will be to allow acceleration, so for longer leaves we want to
skip the getTCloseTo call only if the acceleration is smaller.


A lot of the lines before you test MaxAcc are not needed unless you go 
into the if.  In particular, x,y,[xy][01] are only used if you call 
getTCloseTo().



Renderer.java:  Is this just a straight copy of what I was working on?


I'm pretty sure it is, yes.


Actually I think you've updated the AFD code so I should really take a 
look... :-(


;-)


TransformingPathConsumer2D:

Any thoughts on whether we need translation in the scale filter and
whether a 4-element non-translation filter would be useful?  I think
the current code that drives this passes in the full transform and its
inverse, but it could just as easily do delta transforms instead and
save the adding of the translation components...


I thought about this long ago, but I wasn't sure it wouldn't break anything.
Today, I got a bit more formal with the math, and I think it's ok
to eliminate the translation. I've implemented this (though, I haven't had
time to test it yet. That's for tomorrow).


Right now you have (note that the common terminology for "transform 
without translation" is "delta transform"):


PathIterator
=> DeltaAT => Normalize
=> DeltaInverseAT => strokers
=> FullAT => renderer

The problem is that normalization needs proper sub-pixel positioning so 
you need to hand it the true device space coordinates with proper 
translation.  You need this:


PathIterator
=> FullAT => Normalize
=> DeltaInverseAT => strokers
=> DeltaAT => renderer

I would skip the creation of atNotTranslationPart and just inverse the 
original transform (since I don't think the inversion is affected by 
translation - you can see this in the calculations in 
AT.createInverse()), and then have the transforming consumers apply a 
delta transform - i.e. create a "TPC2D.deltaTransformConsumer()" method 
which would apply just the non-translation parts of AT to the consumer.
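java.awt.geom.AffineTransform already exposes this delta operation directly, so a delta-transforming consumer can lean on it. A small demonstration with an arbitrary scale-plus-translate transform (the demo class itself is illustrative):

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.NoninvertibleTransformException;

public class DeltaDemo {
    // Returns {fullX, fullY, deltaX, deltaY, invDeltaX, invDeltaY}:
    // the point (1, 1) pushed through the full transform and through its
    // delta (non-translation) part, then (2, 3) pushed through the delta
    // part of the inverse.
    static double[] demo() {
        // scale (2, 3) combined with a translation of (100, 200)
        AffineTransform at = new AffineTransform(2, 0, 0, 3, 100, 200);
        double[] full = new double[2], delta = new double[2], inv = new double[2];
        at.transform(new double[]{1, 1}, 0, full, 0, 1);       // translation included
        at.deltaTransform(new double[]{1, 1}, 0, delta, 0, 1); // scale/shear only
        try {
            // Translation does not affect the non-translation entries of
            // the inverse, so the inverse's delta part undoes the delta
            // part above.
            at.createInverse().deltaTransform(new double[]{2, 3}, 0, inv, 0, 1);
        } catch (NoninvertibleTransformException e) {
            throw new RuntimeException(e);
        }
        return new double[]{full[0], full[1], delta[0], delta[1], inv[0], inv[1]};
    }

    public static void main(String[] args) {
        double[] r = demo();
        System.out.println(r[0] + "," + r[1]);  // 102.0,203.0
        System.out.println(r[2] + "," + r[3]);  // 2.0,3.0
        System.out.println(r[4] + "," + r[5]);  // delta of inverse: (1, 1)
    }
}
```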


If you want to get really fancy with optimizations you could have an 
"inverseDeltaTransformConsumer() method that would calculate the 
inversions on the fly to avoid construction of a scratch AT.  Since it 
is just "weird transpose with signs and divide by the determinant" in 
the most general case and even s

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-06 Thread Denis Lila
Hi Jim.

> line 134 - what if numx or numy == 0 because only roots outside [0,1]
> were found?

In this case lines 151-162 will execute, and nothing is wrong. The only
problem is when both numx and numy are 0. This is certainly possible in
the general case (but only with quadratic curves), but the way we're using
this function, the intermediate value theorem guarantees at least one root
will be found. Of course, finite precision math doesn't necessarily care
what calculus has to say, but in this case I can't see how the root computation
could fail. On the other hand, one could argue that this is exactly why we need
to defend against this case, so I've added some checks. 

> line 145 - what if d0 and d1 are both 0?  NaN results.  What if you
> just used a simple "d0 < d1 ? tx : ty" - how far off would that be?

I tried that out on a curve with very high acceleration, and it makes a 
noticeable
difference. So, instead I'm using 
if (w0 == Float.NaN) {
return tx;
}

> line 152,154 - you are likely picking the most negative distance here,
> not the closest to zero.  abs() might help.

Good point.

> line 187 - is that correct?  Shouldn't it be "(...) + 0.5*(...)"?
>   - if that is true then perhaps the change in scale didn't
> really reduce the number of multiplies?

You're right.

> line 220 - any reason to move this method?

No, just an accident while I was experimenting with different algorithms.

> line 238 - ah, perhaps the lion's share of the performance
> improvement?

Indeed. Well, maybe about 75% of it. A surprisingly large part of the
improvement came from changing all the Math.hypot calls to Math.sqrt calls.
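The hypot-to-sqrt change amounts to this (a sketch of a Helpers.linelen-style helper, not the actual Pisces source): Math.hypot guards against overflow and underflow of the intermediate squares, and that robustness costs far more than the plain formula, which is safe for device-space coordinates.

```java
public class LineLen {
    // Plain Euclidean length: Math.sqrt(dx*dx + dy*dy) is much cheaper
    // than Math.hypot(dx, dy), which defends against overflow/underflow
    // that device-space coordinates never approach.
    public static float linelen(float x0, float y0, float x1, float y1) {
        float dx = x1 - x0;
        float dy = y1 - y0;
        return (float) Math.sqrt(dx * dx + dy * dy);
    }
}
```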

> line 357 - another optimization would be to check the acceleration in
> the range and if it is below a threshold then simply use the t from
> line 348 as the t for the split

I like this. I tried implementing it. I haven't tested it yet though, and
I still have to pick a good cutoff acceleration. For now I'm using 1/leaflen.
We definitely don't want it to just be a constant, since the longer the leaf,
the worse it will be to allow acceleration, so for longer leaves we want to
skip the getTCloseTo call only if the acceleration is smaller.
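The cutoff test being described would look something like this sketch (canUseLinearT and its parameters are hypothetical names; 1/leafLen is the provisional threshold mentioned above):

```java
public class SplitHeuristic {
    // When the curve's acceleration over a leaf is small, the cheap
    // linear estimate of t is good enough and the expensive
    // getTCloseTo-style refinement can be skipped.  The threshold
    // shrinks as the leaf grows: longer leaves tolerate less
    // acceleration before the refinement becomes worthwhile.
    public static boolean canUseLinearT(float acceleration, float leafLen) {
        return acceleration < 1f / leafLen;
    }
}
```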

> line 83 - conflicts with test at line 89

Fixed it.

> Renderer.java:  Is this just a straight copy of what I was working on?

I'm pretty sure it is, yes.

> I hate to be a whiner, but I'd rather avoid having to go over every
> line and check it... ;-)

I can't blame you :)

> TransformingPathConsumer2D:
> 
> Any thoughts on whether we need translation in the scale filter and 
> whether a 4-element non-translation filter would be useful?  I think
> the current code that drives this passes in the full transform and its 
> inverse, but it could just as easily do delta transforms instead and 
> save the adding of the translation components...

I thought about this long ago, but I wasn't sure it wouldn't break anything.
Today, I got a bit more formal with the math, and I think it's ok
to eliminate the translation. I've implemented this (though, I haven't had
time to test it yet. That's for tomorrow).

The link is: http://icedtea.classpath.org/~dlila/webrevs/perfWebrev/webrev/

Thank you,
Denis.

- "Jim Graham"  wrote:

> 
>   ...jim
> 
> On 12/1/2010 9:19 AM, Denis Lila wrote:
> > Hi Jim.
> >
> > About two weeks or so ago I replied to one of your very old e-mails
> > about dashing performance and I included a link to the webrev:
> > http://icedtea.classpath.org/~dlila/webrevs/perfWebrev/webrev/
> >
> > I suspect you might have missed it, so that's why I'm writing this.
> > If you haven't, I apologize for this e-mail.
> > Anyway, here's the e-mail:
> >
> http://mail.openjdk.java.net/pipermail/2d-dev/2010-November/001654.html
> >
> > I also just found a bug in the cubic roots math: in the case where
> the function
> > was quadratic, I was forgetting to filter the computed roots to
> include
> > only ones in the range [A, B).
> >
> > Regards,
> > Denis.
> >
> > - "Jim Graham"  wrote:


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-01 Thread Jim Graham

Hi Denis,

Yes, I remember this now that you reminded me.  I'm sorry for having let 
it slide the first time... :-(


Curve.java:

line 134 - what if numx or numy == 0 because only roots outside [0,1] 
were found?


line 145 - what if d0 and d1 are both 0?  NaN results.  What if you just 
used a simple "d0 < d1 ? tx : ty" - how far off would that be?


line 152,154 - you are likely picking the most negative distance here, 
not the closest to zero.  abs() might help.


line 187 - is that correct?  Shouldn't it be "(...) + 0.5*(...)"?
 - if that is true then perhaps the change in scale didn't
   really reduce the number of multiplies?

Dasher.java:

line 220 - any reason to move this method?

line 238 - ah, perhaps the lion's share of the performance improvement?

line 357 - another optimization would be to check the acceleration in 
the range and if it is below a threshold then simply use the t from line 
348 as the t for the split


line 378 - comment applies to deleted function
 - also, maybe move bsc declaration up closer to next() method?

Helpers.java:

line 83 - conflicts with test at line 89

Renderer.java:  Is this just a straight copy of what I was working on? 
I hate to be a whiner, but I'd rather avoid having to go over every line 
and check it... ;-)


TransformingPathConsumer2D:

Any thoughts on whether we need translation in the scale filter and 
whether a 4-element non-translation filter would be useful?  I think the 
current code that drives this passes in the full transform and its 
inverse, but it could just as easily do delta transforms instead and 
save the adding of the translation components...


...jim

On 12/1/2010 9:19 AM, Denis Lila wrote:

Hi Jim.

About two weeks or so ago I replied to one of your very old e-mails
about dashing performance and I included a link to the webrev:
http://icedtea.classpath.org/~dlila/webrevs/perfWebrev/webrev/

I suspect you might have missed it, so that's why I'm writing this.
If you haven't, I apologize for this e-mail.
Anyway, here's the e-mail:
http://mail.openjdk.java.net/pipermail/2d-dev/2010-November/001654.html

I also just found a bug in the cubic roots math: in the case where the function
was quadratic, I was forgetting to filter the computed roots to include
only ones in the range [A, B).

Regards,
Denis.

- "Jim Graham"  wrote:


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-01 Thread Denis Lila
Hi Jim.

About two weeks or so ago I replied to one of your very old e-mails
about dashing performance and I included a link to the webrev:
http://icedtea.classpath.org/~dlila/webrevs/perfWebrev/webrev/

I suspect you might have missed it, so that's why I'm writing this.
If you haven't, I apologize for this e-mail.
Anyway, here's the e-mail:
http://mail.openjdk.java.net/pipermail/2d-dev/2010-November/001654.html

I also just found a bug in the cubic roots math: in the case where the function
was quadratic, I was forgetting to filter the computed roots to include
only ones in the range [A, B).
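The fix described amounts to a filter like this sketch (filterInRange is a hypothetical name; Pisces' actual helper may differ):

```java
public class RootFilter {
    // Keep only the roots in [a, b), compacting them to the front of
    // the array; returns the new root count.
    public static int filterInRange(float[] roots, int num, float a, float b) {
        int kept = 0;
        for (int i = 0; i < num; i++) {
            if (roots[i] >= a && roots[i] < b) {
                roots[kept++] = roots[i];
            }
        }
        return kept;
    }
}
```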

Regards,
Denis.

- "Jim Graham"  wrote:


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-18 Thread Denis Lila
A few corrections/completions:

> I tried to do this. I used the netbeans compiler
netbeans *profiler*.

> I tried to implement something like this. What I did was: I reduced the
> length of the buckets array to have only one bucket per pixel row. I removed
> the array that kept track of how many lines were in each bucket. In next()
> I iterated through the bucket for the current pixel row. For each line, I
> iterated through all the scanlines in the current pixel row that it crossed.
> For each of those, I added a crossing (I still kept these sorted using
> insertion sort, although this might have hurt me - simply quicksorting all
> the crossings just before returning from next() might have been better).
> The format of the crossings was like so: the SUBPIXEL_Y_POSITIONS least
> significant bits were reserved for the y coordinate within the current
> pixel row of the crossing. The next bit was for the orientation. The
> remaining bits were for the actual crossing value.

I neglected to mention how these crossings were used in _end_rendering().
I just unpack them and keep a float[] sums array of length SUBPIXEL_Y_POSITIONS.
This replaces the sum variable. So, if a crossing is on scanline i of
the current pixel row, then sums[i] is used. Everything else is the same.
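The packing scheme described above can be sketched like so (the field widths are illustrative - 3 bits for 8 subpixel rows - not the actual Pisces constants):

```java
public class CrossingPack {
    // Low Y_BITS bits: subpixel y within the pixel row.
    // Next bit: orientation.  Remaining high bits: the crossing's
    // x value (assumed non-negative here).
    static final int Y_BITS = 3;                 // 8 subpixel rows
    static final int Y_MASK = (1 << Y_BITS) - 1;
    static final int OR_BIT = 1 << Y_BITS;

    public static int pack(int x, int orientation, int suby) {
        return (x << (Y_BITS + 1)) | (orientation << Y_BITS) | suby;
    }
    public static int unpackX(int c)    { return c >> (Y_BITS + 1); }
    public static int unpackOr(int c)   { return (c & OR_BIT) >>> Y_BITS; }
    public static int unpackSubY(int c) { return c & Y_MASK; }
}
```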

Regards,
Denis.

- "Denis Lila"  wrote:

> Hi Jim.
> 
> I have a new webrev:
> http://icedtea.classpath.org/~dlila/webrevs/perfWebrev/webrev/
> 


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-18 Thread Denis Lila
Hi Jim.

I have a new webrev: 
http://icedtea.classpath.org/~dlila/webrevs/perfWebrev/webrev/

> How about looking more at the stroking end of the process and I'll dig
> a little more into optimal rasterization code.  I have a lot of
> experience with optimizing rasterizer code (and JNI if it comes to that), but
> very little with the curve manipulations involved in stroking (math is so 
> *hard* at my age ;-)...

I tried to do this. I used the netbeans compiler, and it said that a large
chunk of the time (about 12%) is spent in curveTo. curveTo does almost nothing:
it just packs up the curve array and delegates to somethingTo. This makes me
think that there's not a whole lot that can be done to improve Stroker's
performance (I'm ok with this, since J2DBench and Java2DDemo animation frame
rates both say that non antialiased and non dashed performance is better than
even closed source java). I did make one small change though: I inlined the
calls to dxat, dyat, ddxat, ddyat in ROCsq because the profiler said that a
lot of time was spent there. This made a surprisingly large difference (but
still not that great overall).

I also fixed the dashing performance degradation. I removed the binary sort,
and am now using getCubicRoots to find the t where to split. Another hugely
significant change was using Math.sqrt instead of Math.hypot in the
implementation of Helpers.linelen. I had been meaning to do this for a while,
since sqrt is about 100 times faster on my machine than hypot, but I didn't
realize it would have such a large impact on dashing. Anyway, dashing is now
much, much faster than before. It is even faster than the flattening version
we used to use. The precision might not be as good as the current, slow
implementation, but it's only noticeable for curves with a lot of
acceleration, and even then only if you do a side by side comparison of the 2
implementations. The benchmarks display a 200%-500% improvement, so I think
it is well worth it. Unfortunately curve dashing is still a bit slower than
the closed source counterpart, but not by much.

I also tweaked the AFD algorithm for quadratic curves. It's a bit faster now.

A while ago you made a suggestion on how to improve anti aliasing performance:

> Here are my thoughts:
> 
>   - Currently Renderer has more stages than we probably should have:
>for (each full pixel row) {
>for (each subpixel row) {
>for (each curve on subpixel row) {
>convert curve into crossing
>crossing is (subpixelx:31 + dir:1)
>}
>sort crossings for subpixel row
>apply wind rule and convert crossings into alpha deltas
>}
>convert alpha deltas into run lengths
>}
>for (each tile) {
>convert run lengths into alphas
>}
>   I'm thinking we should be able to get rid of a couple of those stages...
> 
>   - One alternate process would be:
>(all work is done in the tail end of the cache itself)
>for (each full pixel row) {
>for (each curve on full pixel row) {
>convert curve into N subpixel crossings
>(subpixelx:28 + subpixelfracty:3 + dir:1)
>}
>}
>sort all crossings for entire pixel row
>maybe collapse raw crossings into modified accumulated crossings
>record start of row in index array
>for (each tile) {
>convert raw or collapsed crossings directly into alphas
>}
>   Note that we could simply leave the raw crossings in the cache and then
>   process them in the tile generator, but that would require more storage
>   in the cache since a given path would tend to have 8 different entries
>   as it crossed every subpixel row.  If we collapse them, then we can sum
>   the entries for a given x coordinate into a single entry and store:
>(pixelx:25 + coveragedelta:7)
>   where the coverage delta is a signed quantity up to N_SUBPIX^2 so it
>   uses the same storage we needed for the subpixel parts of both X and Y
>   plus the direction bit.  It likely needs a paired value in the next x
>   pixel location just like our current "alpha deltas" needs as well.  If
>   we wanted to optimize that then we could devote one more bit to "the
>   next pixel will consume all of the remainder of the N^2 coverage", but
>   there would be cases where that would not be true (such as if the pixel
>   row has only partial vertical coverage by the path).  It's probably
>   simpler to just have deltas for every pixel.
> 
>   The storage here would likely be similar to the storage used for the
>   current cache since the current RLE cache uses 2 full ints to store a
>   coverage and a 
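The "apply wind rule and convert crossings into alpha deltas" stage in the quoted plan can be illustrated with a much-simplified sketch (whole-pixel crossings and the non-zero wind rule only; not the actual Pisces code):

```java
public class CoverageDeltas {
    // For one scanline: given crossings sorted by x with +1/-1
    // directions, accumulate the winding count and mark every pixel
    // inside a non-zero span as covered.  Real code works on subpixel
    // coordinates and emits deltas rather than filling pixels.
    public static int[] coverage(int width, int[] xs, int[] dirs) {
        int[] cov = new int[width];
        int wind = 0;
        int prev = 0;
        for (int i = 0; i < xs.length; i++) {
            if (wind != 0) {             // span [prev, xs[i]) is inside
                for (int x = prev; x < xs[i]; x++) {
                    cov[x]++;
                }
            }
            wind += dirs[i];
            prev = xs[i];
        }
        return cov;
    }
}
```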

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-10 Thread Jim Graham
FYI - we have a bug to integrate some optimizations from JDK6 for wide 
lines and transformed rectangles.  In 6 I did some work a year ago or so 
to detect simply wide lines and transformed rectangles and to issue a 
"fillParallelogram" internal method.  OGL and D3D can then implement 
these directly and I wrote new software loops for doing pgrams as well. 
 We are right now going through our list or 6 bugs and integrating them 
into 7 so this code should get integrated in the next month or so.  At 
that point I think that horizontal lines may be moot...


...jim

On 11/10/2010 2:01 PM, Denis Lila wrote:

Hi Jim.


- get rid of edgeMxy in all methods but addLine()
- addLine computes min/max of first/lastScanline
- addLine also computes min/max of x1,x2 values

this turned out to be just about the same speed for my FX rendering
version (which I believe is more sensitive than the way it is
integrated
into JDK, so it should be even less noticeable in JDK).  It also paved
the way for a couple of other optimizations that ended up netting
about 1FPS for my current test case that I use so I'm happy for now.
The code is a lot simpler now...


I also implemented what you describe and those are exactly my results too.
I implemented my ideas for optimizing edgeM[in|ax]Y too, but it turned out
not to make any difference whatsoever.

I should note that my benchmarks say the performance on horizontal lines has
decreased by 32% compared to the version where we qsorted everything. The
benchmark report says the overall performance has stayed the same because
every test other than horizontal lines is performing better by about 2-6%.

Regards,
Denis.

- "Jim Graham"  wrote:


I ended up going with:

...jim

On 11/9/2010 3:26 PM, Denis Lila wrote:

Hi again.

I just thought of this: if we're really concerned about the accuracy
of the edgeMinX edgeMaxX variables, we could find the curves'
critical points and use them to compute the min/max X values. After all,
we're creating (or rather "setting") the Curve objects anyway. This isn't
as fast as using the bounding boxes, but it's close and much more accurate.

Regards,
Denis.

- "Denis Lila"   wrote:


Hi Jim.


All lines generated from a given "allegedly monotonic" curve are
recorded with the same "or" (orientation) value.  But, if the

curves

are not truly monotonic then it might be theoretically possible

to

generate a line that is backwards with respect to the expected

orientation.  It

would then get recorded in the edges array with the wrong

orientation

and slope and then rasterization might unravel.


I see. In that case, I think it's a good idea if we don't make

curves

"monotonic". I already did this, by moving the edgeMin/axX/Y

handling

and orientation computations in addLine. This did make it slower
compared
to the file you sent me, but only by very, very little. Curves

were

affected the most, and they were only 1% slower. I think we can

handle

this, especially since lines were about 1% faster. The code is also

70

lines shorter.

The edgeM* members are used only so we don't have to iterate

through

every
scanline if this is not necessary, and so that we can tell

PiscesCache

that the bounding box is smaller than what Renderer is given.

However,

now
that we keep the bucket list, I think it would be more efficient if

we

got rid if EdgeM[in|ax]Y and simply computed the y bounds by

looking

at the
head and tail of the bucket list.
Also, perhaps we can keep track of edgeM[in|ax]X using the

bounding

boxes
of curves, instead of the lines in the flattened curves. This

would

not
be accurate, but I don't think it would affect rendering. It would
simply
result in a few more alpha boxes than necessary. I don't think

these

would
be too bad, because mostly they're just going to be all 0 so they

will

be skipped because getTypicalAlpha() is now implemented.
How do you think we should handle these 4 variables?

Thank you,
Denis.

- "Jim Graham"   wrote:


Hi Denis,

On 11/8/2010 2:39 PM, Denis Lila wrote:

Finally, I discovered (while testing for other problems) that

the

curves are not truly monotonic after slicing them.  I realized

this

years ago

when I was writing my Area code (see sun.awt.geom.Curve) and

put

in

tweaking code to make them monotonic after they were split.

They

are

never off by more than a few bits, but you can't trust the

curve

splitting math to generate purely monotonic segments based on a

t

generated by some unrelated math.  Sometimes the truly

horizontal

or

vertical t value requires more precision than a float (or even

a

double) can provide and splitting at the highest representable

float less than

the t value produces a pair of curves on one side of the hill

and

splitting at the next float value (which is greater than the

true

t

value) produces curves on the other side of the hill.  Also,

when

the

curve has been split a few times already,

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-10 Thread Denis Lila
Hi Jim.

> - get rid of edgeMxy in all methods but addLine()
> - addLine computes min/max of first/lastScanline
> - addLine also computes min/max of x1,x2 values
> 
> this turned out to be just about the same speed for my FX rendering 
> version (which I believe is more sensitive than the way it is
> integrated 
> into JDK, so it should be even less noticeable in JDK).  It also paved
> the way for a couple of other optimizations that ended up netting
> about 1FPS for my current test case that I use so I'm happy for now.
> The code is a lot simpler now...

I also implemented what you describe and those are exactly my results too.
I implemented my ideas for optimizing edgeM[in|ax]Y too, but it turned out
not to make any difference whatsoever.

I should note that my benchmarks say the performance on horizontal lines has
decreased by 32% compared to the version where we qsorted everything. The
benchmark report says the overall performance has stayed the same because
every test other than horizontal lines is performing better by about 2-6%.

Regards,
Denis.

- "Jim Graham"  wrote:

> I ended up going with:
> 
>   ...jim
> 
> On 11/9/2010 3:26 PM, Denis Lila wrote:
> > Hi again.
> >
> > I just thought of this: if we're really concerned about the
> accuracy
> > of the edgeMinX edgeMaxX variables, we could find the curves'
> > critical points and use them to compute the min/max X values. After
> all,
> > we're creating (or rather "setting") the Curve objects anyway. This
> isn't
> > as fast as using the bounding boxes, but it's close and much more
> accurate.
> >
> > Regards,
> > Denis.
> >
> > - "Denis Lila"  wrote:
> >
> >> Hi Jim.
> >>
> >>> All lines generated from a given "allegedly monotonic" curve are
> >>> recorded with the same "or" (orientation) value.  But, if the
> curves
> >>> are not truly monotonic then it might be theoretically possible
> to
> >>> generate a line that is backwards with respect to the expected
> >> orientation.  It
> >>> would then get recorded in the edges array with the wrong
> >> orientation
> >>> and slope and then rasterization might unravel.
> >>
> >> I see. In that case, I think it's a good idea if we don't make
> curves
> >> "monotonic". I already did this, by moving the edgeMin/axX/Y
> handling
> >> and orientation computations in addLine. This did make it slower
> >> compared
> >> to the file you sent me, but only by very, very little. Curves
> were
> >> affected the most, and they were only 1% slower. I think we can
> handle
> >> this, especially since lines were about 1% faster. The code is also
> 70
> >> lines shorter.
> >>
> >> The edgeM* members are used only so we don't have to iterate
> through
> >> every
> >> scanline if this is not necessary, and so that we can tell
> PiscesCache
> >> that the bounding box is smaller than what Renderer is given.
> However,
> >> now
> >> that we keep the bucket list, I think it would be more efficient if
> we
> >> got rid of edgeM[in|ax]Y and simply computed the y bounds by
> looking
> >> at the
> >> head and tail of the bucket list.
> >> Also, perhaps we can keep track of edgeM[in|ax]X using the
> bounding
> >> boxes
> >> of curves, instead of the lines in the flattened curves. This
> would
> >> not
> >> be accurate, but I don't think it would affect rendering. It would
> >> simply
> >> result in a few more alpha boxes than necessary. I don't think
> these
> >> would
> >> be too bad, because mostly they're just going to be all 0 so they
> will
> >> be skipped because getTypicalAlpha() is now implemented.
> >> How do you think we should handle these 4 variables?
> >>
> >> Thank you,
> >> Denis.
> >>
> >> - "Jim Graham"  wrote:
> >>
> >>> Hi Denis,
> >>>
> >>> On 11/8/2010 2:39 PM, Denis Lila wrote:
> > Finally, I discovered (while testing for other problems) that
> the
> > curves are not truly monotonic after slicing them.  I realized
> >> this
> >>> years ago
> > when I was writing my Area code (see sun.awt.geom.Curve) and
> put
> >>> in
> > tweaking code to make them monotonic after they were split. 
> They
> >>> are
> > never off by more than a few bits, but you can't trust the
> curve
> > splitting math to generate purely monotonic segments based on a
> t
> > generated by some unrelated math.  Sometimes the truly
> horizontal
> >>> or
> > vertical t value requires more precision than a float (or even
> a
> > double) can provide and splitting at the highest representable
> >>> float less than
> > the t value produces a pair of curves on one side of the hill
> and
> > splitting at the next float value (which is greater than the
> true
> >>> t
> > value) produces curves on the other side of the hill.  Also,
> when
> >>> the
> > curve has been split a few times already, the t values lose
> >>> accuracy
> > with each split.  This will all be moot if I can eliminate the
> > splitting code from the renderer, but it may also play a factor
> >> 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-10 Thread Denis Lila
Hi Jim.

> Did you have to modify the AFD code for this (in terms of changing
> their limit constants to get good results)?

No, I didn't. By handling non monotonic curves, the AFD algorithm
is going through more iterations, but the only way in which this
could be a problem is through accumulation of numerical inaccuracies,
and I don't think we do enough iterations for this to start causing
perceptible problems. I haven't noticed anything in all my testing.

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> On 11/9/2010 3:06 PM, Denis Lila wrote:
> > I see. In that case, I think it's a good idea if we don't make
> curves
> > "monotonic". I already did this, by moving the edgeMin/axX/Y
> handling
> > and orientation computations in addLine. This did make it slower
> compared
> > to the file you sent me, but only by very, very little. Curves were
> > affected the most, and they were only 1% slower. I think we can
> handle
> > this, especially since lines were about 1% faster. The code is also
> 70
> > lines shorter.
> 
> > The edgeM* members are used only so we don't have to iterate through
> every
> > scanline if this is not necessary, and so that we can tell
> PiscesCache
> > that the bounding box is smaller than what Renderer is given.
> However, now
> > that we keep the bucket list, I think it would be more efficient if
> we
> > got rid of edgeM[in|ax]Y and simply computed the y bounds by looking
> at the
> > head and tail of the bucket list.
> 
> That makes sense.  We calculate that per-edge anyway so the edgeMy 
> constants are redundant.
> 
> > Also, perhaps we can keep track of edgeM[in|ax]X using the bounding
> boxes
> > of curves, instead of the lines in the flattened curves. This would
> not
> > be accurate, but I don't think it would affect rendering. It would
> simply
> > result in a few more alpha boxes than necessary. I don't think these
> would
> > be too bad, because mostly they're just going to be all 0 so they
> will
> > be skipped because getTypicalAlpha() is now implemented.
> > How do you think we should handle these 4 variables?
> 
> I think this is probably OK, but let me play with it a bit and see
> what 
> I come up with.  For one thing, the extra "slop" may not be large
> enough 
> to trigger a full tile of 0's - there would have to be 32-pixel
> borders 
> for that to happen.
> 
> If we get rid of the redundant edgeMy calculations then we might be
> able 
> to do edgeMx calculations on each edge without losing any
> performance...
> 
>   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-09 Thread Jim Graham

I ended up going with:

- get rid of edgeMxy in all methods but addLine()
- addLine computes min/max of first/lastScanline
- addLine also computes min/max of x1,x2 values

this turned out to be just about the same speed for my FX rendering 
version (which I believe is more sensitive than the way it is integrated 
into JDK, so it should be even less noticeable in JDK).  It also paved 
the way for a couple of other optimizations that ended up netting about 
1FPS for my current test case that I use so I'm happy for now.


The code is a lot simpler now...

...jim

On 11/9/2010 3:26 PM, Denis Lila wrote:

Hi again.

I just thought of this: if we're really concerned about the accuracy
of the edgeMinX edgeMaxX variables, we could find the curves'
critical points and use them to compute the min/max X values. After all,
we're creating (or rather "setting") the Curve objects anyway. This isn't
as fast as using the bounding boxes, but it's close and much more accurate.

Regards,
Denis.

- "Denis Lila"  wrote:


Hi Jim.


All lines generated from a given "allegedly monotonic" curve are
recorded with the same "or" (orientation) value.  But, if the curves
are not truly monotonic then it might be theoretically possible to
generate a line that is backwards with respect to the expected
orientation.  It would then get recorded in the edges array with the
wrong orientation and slope and then rasterization might unravel.


I see. In that case, I think it's a good idea if we don't make curves
"monotonic". I already did this, by moving the edgeMin/axX/Y handling
and orientation computations into addLine. This did make it slower
compared
to the file you sent me, but only by very, very little. Curves were
affected the most, and they were only 1% slower. I think we can handle
this, especially since lines were about 1% faster. The code is also 70
lines shorter.

The edgeM* members are used only so we don't have to iterate through
every
scanline if this is not necessary, and so that we can tell PiscesCache
that the bounding box is smaller than what Renderer is given. However,
now
that we keep the bucket list, I think it would be more efficient if we
got rid of edgeM[in|ax]Y and simply computed the y bounds by looking
at the
head and tail of the bucket list.
Also, perhaps we can keep track of edgeM[in|ax]X using the bounding
boxes
of curves, instead of the lines in the flattened curves. This would
not
be accurate, but I don't think it would affect rendering. It would
simply
result in a few more alpha boxes than necessary. I don't think these
would
be too bad, because mostly they're just going to be all 0 so they will
be skipped because getTypicalAlpha() is now implemented.
How do you think we should handle these 4 variables?

Thank you,
Denis.

- "Jim Graham"  wrote:


Hi Denis,

On 11/8/2010 2:39 PM, Denis Lila wrote:

Finally, I discovered (while testing for other problems) that the
curves are not truly monotonic after slicing them.  I realized this
years ago when I was writing my Area code (see sun.awt.geom.Curve) and put in
tweaking code to make them monotonic after they were split.  They are
never off by more than a few bits, but you can't trust the curve
splitting math to generate purely monotonic segments based on a t
generated by some unrelated math.  Sometimes the truly horizontal or
vertical t value requires more precision than a float (or even a
double) can provide and splitting at the highest representable float
less than the t value produces a pair of curves on one side of the hill and
splitting at the next float value (which is greater than the true t
value) produces curves on the other side of the hill.  Also, when the
curve has been split a few times already, the t values lose accuracy
with each split.  This will all be moot if I can eliminate the
splitting code from the renderer, but it may also play a factor in the
stroke/dash code...


Making curves monotonic is only used for optimization purposes,
so I can't see how it would affect rendering correctness.


Fortunately, the non-monotonicity is limited to a few bits of
precision
so this may never generate an errant edge in practice unless
flattening
gets really fine-grained...

...jim
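The subdivision Jim describes is ordinary de Casteljau splitting; in one dimension it looks like the sketch below. The point of his remark is that every lerp rounds, so splitting at the float nearest the extremum's true t can leave either half very slightly non-monotonic:

```java
public class CubicSplit {
    // De Casteljau subdivision of a 1-D cubic (one coordinate of a
    // Bezier curve) at parameter t.  The two halves share the point
    // evaluated at t, but each intermediate lerp rounds in float, which
    // is why monotonicity can be off by a few bits near an extremum.
    public static float[][] split(float p0, float p1, float p2, float p3, float t) {
        float p01  = p0 + t * (p1 - p0);
        float p12  = p1 + t * (p2 - p1);
        float p23  = p2 + t * (p3 - p2);
        float p012 = p01 + t * (p12 - p01);
        float p123 = p12 + t * (p23 - p12);
        float mid  = p012 + t * (p123 - p012);
        return new float[][] {
            { p0, p01, p012, mid },   // left half
            { mid, p123, p23, p3 }    // right half
        };
    }
}
```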


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-09 Thread Jim Graham

Hi Denis,

On 11/9/2010 3:06 PM, Denis Lila wrote:

I see. In that case, I think it's a good idea if we don't make curves
"monotonic". I already did this, by moving the edgeMin/axX/Y handling
and orientation computations into addLine. This did make it slower compared
to the file you sent me, but only by very, very little. Curves were
affected the most, and they were only 1% slower. I think we can handle
this, especially since lines were about 1% faster. The code is also 70
lines shorter.


Did you have to modify the AFD code for this (in terms of changing their 
limit constants to get good results)?



The edgeM* members are used only so we don't have to iterate through every
scanline if this is not necessary, and so that we can tell PiscesCache
that the bounding box is smaller than what Renderer is given. However, now
that we keep the bucket list, I think it would be more efficient if we
got rid of edgeM[in|ax]Y and simply computed the y bounds by looking at the
head and tail of the bucket list.


That makes sense.  We calculate that per-edge anyway so the edgeMy 
constants are redundant.



Also, perhaps we can keep track of edgeM[in|ax]X using the bounding boxes
of curves, instead of the lines in the flattened curves. This would not
be accurate, but I don't think it would affect rendering. It would simply
result in a few more alpha boxes than necessary. I don't think these would
be too bad, because mostly they're just going to be all 0 so they will
be skipped because getTypicalAlpha() is now implemented.
How do you think we should handle these 4 variables?


I think this is probably OK, but let me play with it a bit and see what 
I come up with.  For one thing, the extra "slop" may not be large enough 
to trigger a full tile of 0's - there would have to be 32-pixel borders 
for that to happen.


If we get rid of the redundant edgeMy calculations then we might be able 
to do edgeMx calculations on each edge without losing any performance...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-09 Thread Denis Lila
Hi again.

I just thought of this: if we're really concerned about the accuracy
of the edgeMinX edgeMaxX variables, we could find the curves' 
critical points and use them to compute the min/max X values. After all,
we're creating (or rather "setting") the Curve objects anyway. This isn't
as fast as using the bounding boxes, but it's close and much more accurate.
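Concretely, finding the critical points means solving the quadratic x'(t) = 0 and evaluating x at the roots inside (0, 1). A sketch for the X extent of one cubic (xBounds is a hypothetical helper; the degenerate a == 0 case would need the same quadratic-root care as the rest of the Pisces root code):

```java
public class CurveBoundsX {
    // Exact min/max X of a cubic Bezier occurs at an endpoint or where
    // x'(t) = 0.  x'(t)/3 = a*t^2 + b*t + c in the coefficients below.
    public static float[] xBounds(float x0, float x1, float x2, float x3) {
        float min = Math.min(x0, x3), max = Math.max(x0, x3);
        float a = x3 - 3 * x2 + 3 * x1 - x0;
        float b = 2 * (x2 - 2 * x1 + x0);
        float c = x1 - x0;
        float d = b * b - 4 * a * c;
        if (d >= 0 && a != 0) {          // a == 0 (quadratic) skipped here
            float sq = (float) Math.sqrt(d);
            for (float t : new float[] { (-b + sq) / (2 * a), (-b - sq) / (2 * a) }) {
                if (t > 0 && t < 1) {
                    float u = 1 - t;     // evaluate the cubic at the root
                    float x = u*u*u*x0 + 3*u*u*t*x1 + 3*u*t*t*x2 + t*t*t*x3;
                    min = Math.min(min, x);
                    max = Math.max(max, x);
                }
            }
        }
        return new float[] { min, max };
    }
}
```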

Regards,
Denis.

- "Denis Lila"  wrote:

> Hi Jim.
> 
> > All lines generated from a given "allegedly monotonic" curve are
> > recorded with the same "or" (orientation) value.  But, if the curves
> > are not truly monotonic then it might be theoretically possible to
> > generate a line that is backwards with respect to the expected
> orientation.  It
> > would then get recorded in the edges array with the wrong
> orientation
> > and slope and then rasterization might unravel.
> 
> I see. In that case, I think it's a good idea if we don't make curves
> "monotonic". I already did this, by moving the edgeMin/axX/Y handling
> and orientation computations into addLine. This did make it slower
> compared
> to the file you sent me, but only by very, very little. Curves were
> affected the most, and they were only 1% slower. I think we can handle
> this, especially since lines were about 1% faster. The code is also 70
> lines shorter.
> 
> The edgeM* members are used only so we don't have to iterate through
> every
> scanline if this is not necessary, and so that we can tell PiscesCache
> that the bounding box is smaller than what Renderer is given. However,
> now
> that we keep the bucket list, I think it would be more efficient if we
> got rid of edgeM[in|ax]Y and simply computed the y bounds by looking
> at the
> head and tail of the bucket list.
> Also, perhaps we can keep track of edgeM[in|ax]X using the bounding
> boxes
> of curves, instead of the lines in the flattened curves. This would
> not
> be accurate, but I don't think it would affect rendering. It would
> simply
> result in a few more alpha boxes than necessary. I don't think these
> would
> be too bad, because mostly they're just going to be all 0 so they will
> be skipped because getTypicalAlpha() is now implemented.
> How do you think we should handle these 4 variables?
> 
> Thank you,
> Denis.
> 
> - "Jim Graham"  wrote:
> 
> > Hi Denis,
> >
> > On 11/8/2010 2:39 PM, Denis Lila wrote:
> > >> Finally, I discovered (while testing for other problems) that the
> > >> curves are not truly monotonic after slicing them.  I realized
> this
> > years ago
> > >> when I was writing my Area code (see sun.awt.geom.Curve) and put
> > in
> > >> tweaking code to make them monotonic after they were split.  They
> > are
> > >> never off by more than a few bits, but you can't trust the curve
> > >> splitting math to generate purely monotonic segments based on a t
> > >> generated by some unrelated math.  Sometimes the truly horizontal
> > or
> > >> vertical t value requires more precision than a float (or even a
> > >> double) can provide and splitting at the highest representable
> > float less than
> > >> the t value produces a pair of curves on one side of the hill and
> > >> splitting at the next float value (which is greater than the true
> > t
> > >> value) produces curves on the other side of the hill.  Also, when
> > the
> > >> curve has been split a few times already, the t values lose
> > accuracy
> > >> with each split.  This will all be moot if I can eliminate the
> > >> splitting code from the renderer, but it may also play a factor
> in
> > the
> > >> stroke/dash
> > >> code...
> > >
> > > Making curves monotonic is only used for optimization purposes,
> > > so I can't see how it would affect rendering correctness.
> >
> > Fortunately, the non-monotonicity is limited to a few bits of
> > precision
> > so this may never generate an errant edge in practice unless
> > flattening
> > gets really fine-grained...
> >
> > ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-09 Thread Denis Lila
Hi Jim.

> All lines generated from a given "allegedly monotonic" curve are 
> recorded with the same "or" (orientation) value.  But, if the curves
> are not truly monotonic then it might be theoretically possible to
> generate a line that is backwards with respect to the expected orientation.  
> It
> would then get recorded in the edges array with the wrong orientation
> and slope and then rasterization might unravel.

I see. In that case, I think it's a good idea if we don't make curves
"monotonic". I already did this, by moving the edgeMin/axX/Y handling
and orientation computations into addLine. This did make it slower compared
to the file you sent me, but only by very, very little. Curves were
affected the most, and they were only 1% slower. I think we can handle
this, especially since lines were about 1% faster. The code is also 70
lines shorter.

The edgeM* members are used only so we don't have to iterate through every
scanline if this is not necessary, and so that we can tell PiscesCache
that the bounding box is smaller than what Renderer is given. However, now
that we keep the bucket list, I think it would be more efficient if we
got rid of edgeM[in|ax]Y and simply computed the y bounds by looking at the
head and tail of the bucket list.
Also, perhaps we can keep track of edgeM[in|ax]X using the bounding boxes
of curves, instead of the lines in the flattened curves. This would not
be accurate, but I don't think it would affect rendering. It would simply
result in a few more alpha boxes than necessary. I don't think these would
be too bad, because mostly they're just going to be all 0 so they will
be skipped because getTypicalAlpha() is now implemented.
How do you think we should handle these 4 variables?

Thank you,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> On 11/8/2010 2:39 PM, Denis Lila wrote:
> >> Finally, I discovered (while testing for other problems) that the
> >> curves are not truly monotonic after slicing them.  I realized this
> years ago
> >> when I was writing my Area code (see sun.awt.geom.Curve) and put
> in
> >> tweaking code to make them monotonic after they were split.  They
> are
> >> never off by more than a few bits, but you can't trust the curve
> >> splitting math to generate purely monotonic segments based on a t
> >> generated by some unrelated math.  Sometimes the truly horizontal
> or
> >> vertical t value requires more precision than a float (or even a
> >> double) can provide and splitting at the highest representable
> float less than
> >> the t value produces a pair of curves on one side of the hill and
> >> splitting at the next float value (which is greater than the true
> t
> >> value) produces curves on the other side of the hill.  Also, when
> the
> > >> curve has been split a few times already, the t values lose
> accuracy
> >> with each split.  This will all be moot if I can eliminate the
> >> splitting code from the renderer, but it may also play a factor in
> the
> >> stroke/dash
> >> code...
> >
> > Making curves monotonic is only used for optimization purposes,
> > so I can't see how it would affect rendering correctness.
> 
> Fortunately, the non-monotonicity is limited to a few bits of
> precision 
> so this may never generate an errant edge in practice unless
> flattening 
> gets really fine-grained...
> 
>   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-09 Thread Denis Lila
Hi Jim.

> The problem is that pruning complicates the inner loop that advances
> the scanline and you can't tell without scanning every segment in play 
> whether you need to do it in its own loop.  Thus, for one of my test 
> cases it was really expensive without some up front record of whether
> or not the pruning was needed.

Ah, I see. I misunderstood this yesterday. I read the line as
...[firstCrossing - boundsMinY] |= 1 instead of ...[lastCrossing - boundsMinY] |= 1.
My bad.

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> Also, I keep a count so I can amortize the "widenArray" call.  Before
> I 
> was calling "widen(..., 1)" inside the loop that drew new edges in
> from 
> the bucket.  I could also have added an additional loop to count the
> new 
> edges, but that would have been expensive as well...
> 
>   ...jim
> 
> On 11/8/2010 3:23 PM, Denis Lila wrote:
> > Hi Jim.
> >
> > I like the new Renderer, but I have a question about
> edgeBucketCount.
> >
> > As far as I can see it is only used so that we can tell whether
> > any given bucket contains only edges that go all the way to the
> last
> > (or beyond) scanline. Is this really common enough that we gain by
> > treating it as a special case? Would it not be better to assume
> that
> > every bucket needs pruning and save some memory (and possibly
> resulting
> > in better cache performance)?
> >
> > Thanks,
> > Denis.
> >
> > - "Jim Graham"  wrote:
> >
> >> It's still a work in progress, but I've cleaned up a lot of logic
> and
> >>
> >> made it faster in a number of ways.  Note that I've abstracted out
> the
> >>
> >> cache stuff and created an "AlphaConsumer" interface which may
> evolve
> >>
> >> over time.
> >>
> >> In FX we actually consume alphas in larger chunks than the code in
> JDK
> >>
> >> which was driven by Ductus's 32x32 mandate, so I would have had to
> >> make
> >> completely incompatible changes to emitRow - so I moved it behind
> an
> >> interface.  For the JDK code, if you want to integrate this
> version, I
> >>
> >> would have the cache implement the new interface and move your
> version
> >>
> >> of emitRow into the Cache object.  I'd send you the new code for
> my
> >> AlphaConsumer, but it is incompatible with what you need to cache
> so
> >> it
> >> won't help you.
> >>
> >> You'll also need a bit of un-translation cleanup as we have mirrors
> of
> >>
> >> all of java.awt.geom with special new APIs that FX needs.
> >>
> >>...jim
> >>
> >> On 11/8/2010 6:40 AM, Denis Lila wrote:
> >>> Hi Jim.
> >>>
>  Also, I've gotten another 20% improvement out of the design with
> a
> >> few
>  more tweaks.  (Though I measured the 20% in the stripped down
> >> version
>  that I'm prototyping with FX so I'm not sure how much of that
> 20%
>  would show up through the layers of the 2D code.  Overall, I've
> >> about
>  doubled the frame rates of the prototype since your first drop
> that
> >> you
>  checked in to the OpenJDK repository.)
> >>>
> >>> Can I see your new version?
> >>
> >> Attached.
> >>
>  How about looking more at the stroking end of the process and
> I'll
> >> dig
>  a little more into optimal rasterization code.  I have a lot of
>  experience with optimizing rasterizer code (and JNI if it comes
> to
> >> that), but
>  very little with the curve manipulations involved in stroking
> (math
> >> is so
>  *hard* at my age ;-)...
> >>>
> >>> Sounds good. Have you implemented your idea of processing one
> pixel
> >> row at a
> >>> time, as opposed to processing subpixel rows? If not, I could do
> >> that.
> >>
> >> Not yet.  Right now I've gotten a lot of mileage out of a few
> tweaks
> >> of
> >> the bookkeeping of the sample-row-at-a-time version.  I'm still
> >> mulling
> >> over exactly how to make that go faster.
> >>
> >>...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham

Hi Denis,

The problem is that pruning complicates the inner loop that advances the 
scanline and you can't tell without scanning every segment in play 
whether you need to do it in its own loop.  Thus, for one of my test 
cases it was really expensive without some up front record of whether or 
not the pruning was needed.


Also, I keep a count so I can amortize the "widenArray" call.  Before I 
was calling "widen(..., 1)" inside the loop that drew new edges in from 
the bucket.  I could also have added an additional loop to count the new 
edges, but that would have been expensive as well...


...jim

On 11/8/2010 3:23 PM, Denis Lila wrote:

Hi Jim.

I like the new Renderer, but I have a question about edgeBucketCount.

As far as I can see it is only used so that we can tell whether
any given bucket contains only edges that go all the way to the last
(or beyond) scanline. Is this really common enough that we gain by
treating it as a special case? Would it not be better to assume that
every bucket needs pruning and save some memory (and possibly resulting
in better cache performance)?

Thanks,
Denis.

- "Jim Graham"  wrote:


It's still a work in progress, but I've cleaned up a lot of logic and

made it faster in a number of ways.  Note that I've abstracted out the

cache stuff and created an "AlphaConsumer" interface which may evolve

over time.

In FX we actually consume alphas in larger chunks than the code in JDK

which was driven by Ductus's 32x32 mandate, so I would have had to
make
completely incompatible changes to emitRow - so I moved it behind an
interface.  For the JDK code, if you want to integrate this version, I

would have the cache implement the new interface and move your version

of emitRow into the Cache object.  I'd send you the new code for my
AlphaConsumer, but it is incompatible with what you need to cache so
it
won't help you.

You'll also need a bit of un-translation cleanup as we have mirrors of

all of java.awt.geom with special new APIs that FX needs.

...jim

On 11/8/2010 6:40 AM, Denis Lila wrote:

Hi Jim.


Also, I've gotten another 20% improvement out of the design with a

few

more tweaks.  (Though I measured the 20% in the stripped down

version

that I'm prototyping with FX so I'm not sure how much of that 20%
would show up through the layers of the 2D code.  Overall, I've

about

doubled the frame rates of the prototype since your first drop that

you

checked in to the OpenJDK repository.)


Can I see your new version?


Attached.


How about looking more at the stroking end of the process and I'll

dig

a little more into optimal rasterization code.  I have a lot of
experience with optimizing rasterizer code (and JNI if it comes to

that), but

very little with the curve manipulations involved in stroking (math

is so

*hard* at my age ;-)...


Sounds good. Have you implemented your idea of processing one pixel

row at a

time, as opposed to processing subpixel rows? If not, I could do

that.

Not yet.  Right now I've gotten a lot of mileage out of a few tweaks
of
the bookkeeping of the sample-row-at-a-time version.  I'm still
mulling
over exactly how to make that go faster.

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham

Hi Denis,

On 11/8/2010 2:39 PM, Denis Lila wrote:

Finally, I discovered (while testing for other problems) that the
curves are not truly monotonic after slicing them.  I realized this years ago
when I was writing my Area code (see sun.awt.geom.Curve) and put in
tweaking code to make them monotonic after they were split.  They are
never off by more than a few bits, but you can't trust the curve
splitting math to generate purely monotonic segments based on a t
generated by some unrelated math.  Sometimes the truly horizontal or
vertical t value requires more precision than a float (or even a
double) can provide and splitting at the highest representable float less than
the t value produces a pair of curves on one side of the hill and
splitting at the next float value (which is greater than the true t
value) produces curves on the other side of the hill.  Also, when the
curve has been split a few times already, the t values lose accuracy
with each split.  This will all be moot if I can eliminate the
splitting code from the renderer, but it may also play a factor in the
stroke/dash
code...


Making curves monotonic is only used for optimization purposes,
so I can't see how it would affect rendering correctness.


All lines generated from a given "allegedly monotonic" curve are 
recorded with the same "or" (orientation) value.  But, if the curves are 
not truly monotonic then it might be theoretically possible to generate 
a line that is backwards with respect to the expected orientation.  It 
would then get recorded in the edges array with the wrong orientation 
and slope and then rasterization might unravel.


Fortunately, the non-monotonicity is limited to a few bits of precision 
so this may never generate an errant edge in practice unless flattening 
gets really fine-grained...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Denis Lila
Hi Jim.

I like the new Renderer, but I have a question about edgeBucketCount.

As far as I can see it is only used so that we can tell whether
any given bucket contains only edges that go all the way to the last
(or beyond) scanline. Is this really common enough that we gain by
treating it as a special case? Would it not be better to assume that
every bucket needs pruning and save some memory (and possibly resulting
in better cache performance)?

Thanks,
Denis.

- "Jim Graham"  wrote:

> It's still a work in progress, but I've cleaned up a lot of logic and
> 
> made it faster in a number of ways.  Note that I've abstracted out the
> 
> cache stuff and created an "AlphaConsumer" interface which may evolve
> 
> over time.
> 
> In FX we actually consume alphas in larger chunks than the code in JDK
> 
> which was driven by Ductus's 32x32 mandate, so I would have had to
> make 
> completely incompatible changes to emitRow - so I moved it behind an 
> interface.  For the JDK code, if you want to integrate this version, I
> 
> would have the cache implement the new interface and move your version
> 
> of emitRow into the Cache object.  I'd send you the new code for my 
> AlphaConsumer, but it is incompatible with what you need to cache so
> it 
> won't help you.
> 
> You'll also need a bit of un-translation cleanup as we have mirrors of
> 
> all of java.awt.geom with special new APIs that FX needs.
> 
>   ...jim
> 
> On 11/8/2010 6:40 AM, Denis Lila wrote:
> > Hi Jim.
> >
> >> Also, I've gotten another 20% improvement out of the design with a
> few
> >> more tweaks.  (Though I measured the 20% in the stripped down
> version
> >> that I'm prototyping with FX so I'm not sure how much of that 20%
> >> would show up through the layers of the 2D code.  Overall, I've
> about
> >> doubled the frame rates of the prototype since your first drop that
> you
> >> checked in to the OpenJDK repository.)
> >
> > Can I see your new version?
> 
> Attached.
> 
> >> How about looking more at the stroking end of the process and I'll
> dig
> >> a little more into optimal rasterization code.  I have a lot of
> >> experience with optimizing rasterizer code (and JNI if it comes to
> that), but
> >> very little with the curve manipulations involved in stroking (math
> is so
> >> *hard* at my age ;-)...
> >
> > Sounds good. Have you implemented your idea of processing one pixel
> row at a
> > time, as opposed to processing subpixel rows? If not, I could do
> that.
> 
> Not yet.  Right now I've gotten a lot of mileage out of a few tweaks
> of 
> the bookkeeping of the sample-row-at-a-time version.  I'm still
> mulling 
> over exactly how to make that go faster.
> 
>   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Denis Lila
Hi Jim.

> A couple of questions about the code that I haven't touched...
> Is there some reason why the AFD for cubics doesn't have any tests for
> dddxy (the "constants" for its equations), but the AFD for quads is 
> testing the ddxy on every loop?  I know that these values do change
> when the AFD variables are doubled or halved, but why does the cubic
> version get away with only testing out to the n-1th order differences but the
> quad version has to test out to the nth order differences?

That's because I used to have only one AFD function that worked for both
cubics and quadratics. It simply treated quadratics as cubics with the
first coefficient 0. I decided to split them up because we might want
to do different things for quadratics, but I didn't get around to doing
any of them (yet) so that's why it is the way it is.

> Also, what is the value of breaking the pieces into monotonic segments
> prior to flattening?  Is it simply amortizing the cost of determining
> if the segment is up or down?  I guess this used to be done because we 
> needed monotonic (in Y) curve segments since we did a top-down
> iteration across all segments, but now they aren't in the rasterization loop. 
> If it is simply a performance issue then I may experiment with
> eliminating that stage and seeing if I can make it go faster overall.

Well, making them monotonic makes updating edgeM[in|ax][X|Y] more efficient.
We only have to do the tests once per subdivided curve instead of once per
line. Also it helps with making our lines have y2>y1. Without monotonicity
we'd have to test each individual line, instead of just reversing the 
curve coordinate array. However, for the latter, all we need is monotonicity
in Y. We should make some measurements to determine whether the above
are enough to compensate for the subdivision costs.

> Finally, I discovered (while testing for other problems) that the
> curves are not truly monotonic after slicing them.  I realized this years ago
> when I was writing my Area code (see sun.awt.geom.Curve) and put in 
> tweaking code to make them monotonic after they were split.  They are
> never off by more than a few bits, but you can't trust the curve 
> splitting math to generate purely monotonic segments based on a t 
> generated by some unrelated math.  Sometimes the truly horizontal or 
> vertical t value requires more precision than a float (or even a
> double) can provide and splitting at the highest representable float less than
> the t value produces a pair of curves on one side of the hill and 
> splitting at the next float value (which is greater than the true t 
> value) produces curves on the other side of the hill.  Also, when the
> curve has been split a few times already, the t values lose accuracy
> with each split.  This will all be moot if I can eliminate the
> splitting code from the renderer, but it may also play a factor in the
> stroke/dash 
> code...

Making curves monotonic is only used for optimization purposes,
so I can't see how it would affect rendering correctness.

Regards,
Denis.

- "Jim Graham"  wrote:

> 
>   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham
Also, some things to note in the new version - things I had to "fix" not 
related to performance.


In endRendering the pixel bounds of the edges needed to be computed and 
passed along to the next stage.  Unfortunately the code in endRendering 
computed the sub-pixel bounds and then simply shifted them to get the 
pixel addresses of those bounds.  For the bottom and right edges, this 
means that a ceiling operation was performed on the sub-pixel data, but 
a floor operation was performed on the conversion to pixel addresses. 
This means that the bottom few sub-pixel rows could have been sliced off 
in these calculations.  I restructured the calculations there to make 
sure that it produced the pixel bounds of all of the subpixel rows. 
Also, I use a ceil() operation on the sub-pixel values because we really 
want to know what is the next sub-pixel sample row/column that we will 
cross and then get the (pixel) bounds of those sample row/columns.
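A minimal sketch of the conversion being described (the 3-bit sub-pixel shift and the names are assumptions for illustration, not the actual Pisces constants):

```java
// Illustrative conversion of sub-pixel bounds to enclosing pixel bounds,
// assuming 8 (= 1 << 3) sub-pixel rows per pixel.
public final class PixelBounds {
    static final int SUBPIXEL_LG_Y = 3;                     // assumed value
    static final int SUBPIXEL_MASK_Y = (1 << SUBPIXEL_LG_Y) - 1;

    // Floor is correct for the minimum edge: the first sub-pixel row lies
    // inside this pixel row.
    public static int pixMinY(int spMinY) {
        return spMinY >> SUBPIXEL_LG_Y;
    }

    // For the maximum edge a ceiling is needed, otherwise the last few
    // sub-pixel rows get sliced off (the bug described above).
    public static int pixMaxY(int spMaxY) {
        return (spMaxY + SUBPIXEL_MASK_Y) >> SUBPIXEL_LG_Y; // ceil(spMaxY / 8)
    }

    public static void main(String[] args) {
        // Sub-pixel rows 5..17 span pixel rows 0..2, i.e. the range [0, 3).
        System.out.println(pixMinY(5) + " " + pixMaxY(17)); // prints "0 3"
    }
}
```

With a floor on both ends, pixMaxY(17) would come out as 2 and sub-pixel rows 16..17 would be lost, which matches the symptom Jim describes.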


I think there may have been a couple of other places where the "which 
pixel are we on" code got sloppy along similar lines to the endRendering 
case.


The conditions for sending the alphas for the last row changed a little. 
 I need to have every row visited in my AlphaConsumer because it is 
cleaning out an alpha tile as it incorporates the new alphas.  Thus, I 
needed the logic to call "getAndClear" even if there were no alphas to 
deliver.  I don't think that change would have affected your version 
with the PiscesCache since you can just finalize the cache when the 
alphas are done and clearing the alphaRow[] array is irrelevant at that 
final stage.  The way this works may change as I continue optimizing.


...jim

On 11/8/2010 6:40 AM, Denis Lila wrote:

Hi Jim.


Also, I've gotten another 20% improvement out of the design with a few
more tweaks.  (Though I measured the 20% in the stripped down version
that I'm prototyping with FX so I'm not sure how much of that 20%
would show up through the layers of the 2D code.  Overall, I've about
doubled the frame rates of the prototype since your first drop that you
checked in to the OpenJDK repository.)


Can I see your new version?


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham

A couple of questions about the code that I haven't touched...

Is there some reason why the AFD for cubics doesn't have any tests for 
dddxy (the "constants" for its equations), but the AFD for quads is 
testing the ddxy on every loop?  I know that these values do change when 
the AFD variables are doubled or halved, but why does the cubic version 
get away with only testing out to the n-1th order differences but the 
quad version has to test out to the nth order differences?
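For reference, the bookkeeping under discussion: for a quadratic f(t) = a*t^2 + b*t + c sampled with step h, the first difference d = f(t+h) - f(t) changes linearly, and the second difference dd = 2*a*h^2 is constant, so the curve can be stepped with two additions. (Halving the step maps (d, dd) to ((d - dd/4)/2, dd/4); doubling maps it to (2d + dd, 4dd) - that is the adaptive part whose limit tests are being questioned.) A non-adaptive sketch, not the Pisces code:

```java
// Illustrative (non-adaptive) forward differencing of a quadratic.
// For f(t) = a*t^2 + b*t + c stepped by h:
//   d  = f(t+h) - f(t) = 2*a*t*h + a*h*h + b*h   (first difference, linear in t)
//   dd = 2*a*h*h                                 (second difference, constant)
public final class QuadFD {

    // Samples f at t = 0, h, 2h, ..., 1 using only additions in the loop.
    public static double[] sample(double a, double b, double c, int steps) {
        double h = 1.0 / steps;
        double f = c;                       // f(0)
        double d = a*h*h + b*h;             // first difference at t = 0
        double dd = 2*a*h*h;                // constant second difference
        double[] out = new double[steps + 1];
        out[0] = f;
        for (int i = 1; i <= steps; i++) {
            f += d;
            d += dd;
            out[i] = f;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] s = sample(1, 0, 0, 4);    // f(t) = t^2 on [0,1]
        System.out.println(s[4]);           // prints 1.0 (FD is exact for quads)
    }
}
```

For a cubic there is additionally a constant third difference dddxy, which is what Jim's question about the missing tests refers to.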


Also, what is the value of breaking the pieces into monotonic segments 
prior to flattening?  Is it simply amortizing the cost of determining if 
the segment is up or down?  I guess this used to be done because we 
needed monotonic (in Y) curve segments since we did a top-down iteration 
across all segments, but now they aren't in the rasterization loop.  If 
it is simply a performance issue then I may experiment with eliminating 
that stage and seeing if I can make it go faster overall.


Finally, I discovered (while testing for other problems) that the curves 
are not truly monotonic after slicing them.  I realized this years ago 
when I was writing my Area code (see sun.awt.geom.Curve) and put in 
tweaking code to make them monotonic after they were split.  They are 
never off by more than a few bits, but you can't trust the curve 
splitting math to generate purely monotonic segments based on a t 
generated by some unrelated math.  Sometimes the truly horizontal or 
vertical t value requires more precision than a float (or even a double) 
can provide and splitting at the highest representable float less than 
the t value produces a pair of curves on one side of the hill and 
splitting at the next float value (which is greater than the true t 
value) produces curves on the other side of the hill.  Also, when the 
curve has been split a few times already, the t values lose accuracy 
with each split.  This will all be moot if I can eliminate the splitting 
code from the renderer, but it may also play a factor in the stroke/dash 
code...
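The splitting math in question is de Casteljau subdivision; a minimal sketch (hypothetical names, x-coordinates only, not the actual Pisces code) shows why the split point is exact but monotonicity of each half is only as good as the float t supplied:

```java
// Hypothetical sketch of de Casteljau subdivision of a cubic at parameter t.
// Both halves share the interpolated point xMid = curve(t); if t is a float
// rounded away from the true extremum, one half can overshoot the hill by a
// few ulps, which is the non-monotonicity described above.
public final class CubicSplit {

    // Splits the cubic (x0..x3) at t; returns {left 4 coords, right 4 coords}.
    public static float[][] splitAt(float t, float x0, float x1, float x2, float x3) {
        float x01  = x0  + t * (x1  - x0);
        float x12  = x1  + t * (x2  - x1);
        float x23  = x2  + t * (x3  - x2);
        float x012 = x01 + t * (x12 - x01);
        float x123 = x12 + t * (x23 - x12);
        float xMid = x012 + t * (x123 - x012);   // = curve value at t
        return new float[][] {
            { x0, x01, x012, xMid },
            { xMid, x123, x23, x3 }
        };
    }

    public static void main(String[] args) {
        float[][] halves = splitAt(0.5f, 0f, 2f, 2f, 0f);
        // The halves always meet exactly at the split point...
        System.out.println(halves[0][3] == halves[1][0]); // prints true
        // ...but nothing guarantees each half is monotonic when t was rounded.
    }
}
```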


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham
It's still a work in progress, but I've cleaned up a lot of logic and 
made it faster in a number of ways.  Note that I've abstracted out the 
cache stuff and created an "AlphaConsumer" interface which may evolve 
over time.


In FX we actually consume alphas in larger chunks than the code in JDK 
which was driven by Ductus's 32x32 mandate, so I would have had to make 
completely incompatible changes to emitRow - so I moved it behind an 
interface.  For the JDK code, if you want to integrate this version, I 
would have the cache implement the new interface and move your version 
of emitRow into the Cache object.  I'd send you the new code for my 
AlphaConsumer, but it is incompatible with what you need to cache so it 
won't help you.


You'll also need a bit of un-translation cleanup as we have mirrors of 
all of java.awt.geom with special new APIs that FX needs.


...jim

On 11/8/2010 6:40 AM, Denis Lila wrote:

Hi Jim.


Also, I've gotten another 20% improvement out of the design with a few
more tweaks.  (Though I measured the 20% in the stripped down version
that I'm prototyping with FX so I'm not sure how much of that 20%
would show up through the layers of the 2D code.  Overall, I've about
doubled the frame rates of the prototype since your first drop that you
checked in to the OpenJDK repository.)


Can I see your new version?


Attached.


How about looking more at the stroking end of the process and I'll dig
a little more into optimal rasterization code.  I have a lot of
experience with optimizing rasterizer code (and JNI if it comes to that), but
very little with the curve manipulations involved in stroking (math is so
*hard* at my age ;-)...


Sounds good. Have you implemented your idea of processing one pixel row at a
time, as opposed to processing subpixel rows? If not, I could do that.


Not yet.  Right now I've gotten a lot of mileage out of a few tweaks of 
the bookkeeping of the sample-row-at-a-time version.  I'm still mulling 
over exactly how to make that go faster.


...jim
package com.sun.openpisces;

/**
 * @author Flar
 */
public interface AlphaConsumer {
    public int getOriginX();
    public int getOriginY();
    public int getWidth();
    public int getHeight();
    public void setMaxAlpha(int maxalpha);
    public void setAndClearRelativeAlphas(int alphaDeltas[], int y,
                                          int firstdelta, int lastdelta);
}
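The "relative alphas" in setAndClearRelativeAlphas appear to be delta-encoded coverage: entry i holds the change in alpha at column i, so a consumer reconstitutes absolute coverage with a running sum and zeroes the array for the next row (hence "AndClear"). A sketch of what such a consumer might do internally (illustrative, not the FX implementation):

```java
// Illustrative consumer helper: turns delta-encoded alphas into absolute
// coverage for one row, clearing the delta array for reuse as it goes.
public final class DeltaRow {

    public static int[] accumulate(int[] deltas, int first, int last) {
        int[] cov = new int[deltas.length];
        int accum = 0;
        for (int i = first; i <= last; i++) {
            accum += deltas[i];   // running sum of deltas = absolute alpha
            cov[i] = accum;
            deltas[i] = 0;        // clear so the renderer can reuse the row
        }
        return cov;
    }

    public static void main(String[] args) {
        int[] deltas = {0, 255, 0, 0, -255, 0};  // opaque run over columns 1..3
        int[] cov = accumulate(deltas, 0, 5);
        System.out.println(java.util.Arrays.toString(cov));
        // prints [0, 255, 255, 255, 0, 0]
    }
}
```

Delta encoding is what makes crossings cheap to emit: each edge crossing touches only two entries of the row, regardless of the span's width.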
package com.sun.openpisces;

import com.sun.javafx.geom.PathConsumer;
import com.sun.javafx.geom.PathIterator;
import com.sun.javafx.geom.Rectangle;
import java.util.Iterator;

/**
 * @author Flar
 */
public final class OpenPiscesRenderer implements PathConsumer {

    public static void feedConsumer(PathIterator pi, PathConsumer pc) {
        float[] coords = new float[6];
        while (!pi.isDone()) {
            int type = pi.currentSegment(coords);
            switch (type) {
            case PathIterator.SEG_MOVETO:
                pc.moveTo(coords[0], coords[1]);
                break;
            case PathIterator.SEG_LINETO:
                pc.lineTo(coords[0], coords[1]);
                break;
            case PathIterator.SEG_QUADTO:
                pc.quadTo(coords[0], coords[1],
                          coords[2], coords[3]);
                break;
            case PathIterator.SEG_CUBICTO:
                pc.curveTo(coords[0], coords[1],
                           coords[2], coords[3],
                           coords[4], coords[5]);
                break;
            case PathIterator.SEG_CLOSE:
                pc.closePath();
                break;
            }
            pi.next();
        }
        pc.pathDone();
    }

    private final class ScanlineIterator {

        private int[] crossings;
        private int[] edgePtrs;
        private int edgeCount;

        // crossing bounds. The bounds are not necessarily tight (the scan line
        // at minY, for example, might have no crossings). The x bounds will
        // be accumulated as crossings are computed.
        private final int maxY;
        private int nextY;

        private static final int INIT_CROSSINGS_SIZE = 10;

        private ScanlineIterator() {
            crossings = new int[INIT_CROSSINGS_SIZE];
            edgePtrs = new int[INIT_CROSSINGS_SIZE];

            // We don't care if we clip some of the line off with ceil, since
            // no scan line crossings will be eliminated (in fact, the ceil is
            // the y of the first scan line crossing).
            nextY = getFirstScanLineCrossing();
            maxY = getScanLineCrossingEnd() - 1;
        }

        private int next() {
            // TODO: make function that convert from y value to bucket idx?
            int cury = nextY++;
            int bucket = cury - boundsMinY;
            int count = this.edgeCount;
            int ptrs[] = this.edgePtrs;

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham

On 11/8/2010 6:34 AM, Denis Lila wrote:

Hi Clemens.


I've only followed your discussion with Jim but skipped all the
in-depth discussion.
From my prior experiences, JNI is usually not worth the trouble: if
you don't have a serious reason why using native code would have
advantages (like the possibility of using SIMD or when value-types
would be a huge benefit), it has its own performance pitfalls
especially if the workload is small and things like Get*ArrayCritical
cause scalability problems because they have to lock the GC.


Well, Jim Graham said that a native version of the engine still runs
a lot faster than the version with all my changes. That's why I thought


Actually, that report is old.  I've now got the new Java version turning 
in double the frame rates of the old native version.



it would be a good idea. Also, when not doing antialiasing we usually
feed paths to native consumers, so I thought if pisces used JNI, we
could reduce the java->C transitions five fold. But then I realized that
with antialiasing the opposite would happen, so I'm not sure whether
JNI is a good idea.


That's a good point that the other rasterizers will end up using this 
stroking engine and they are native.  We can worry about cleaning that 
up later.  JNI might eventually be a good idea, but lets fix the 
algorithm first and then worry about whether it will help this renderer 
or if we can make the interface to the other renderers simpler.


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Mario Torre
On Mon, 08/11/2010 at 09:34 -0500, Denis Lila wrote:

> > Have you had a look at the Netbeans profiler? It supports sampling
> > based testing to keep the influence of the profiler at a minimum.
> 
> No, I don't use netbeans - it doesn't render underscores properly (although
> I think this is an openjdk bug). I'll try it out.

This has been fixed in OpenJDK 7, and should have been back ported to 6,
not sure when the fix will be out.

> > Thanks for working on this!

yup :)

Mario
-- 
pgp key: http://subkeys.pgp.net/ PGP Key ID: 80F240CF
Fingerprint: BA39 9666 94EC 8B73 27FA  FC7C 4086 63E3 80F2 40CF

Proud GNU Classpath developer: http://www.classpath.org/
Read About us at: http://planet.classpath.org
OpenJDK: http://openjdk.java.net/projects/caciocavallo/

Please, support open standards:
http://endsoftpatents.org/



Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Denis Lila
Hi Clemens.

> I've only followed your discussion with Jim but skipped all the
> in-depth discussion.
> From my prior experience, JNI is usually not worth the trouble: if
> you don't have a serious reason why using native code would have
> advantages (like the possibility of using SIMD or when value-types
> would be a huge benefit), it has its own performance pitfalls,
> especially if the workload is small, and things like Get*ArrayCritical
> cause scalability problems because they have to lock the GC.

Well, Jim Graham said that a native version of the engine still runs
a lot faster than the version with all my changes. That's why I thought
it would be a good idea. Also, when not doing antialiasing we usually
feed paths to native consumers, so I thought if pisces used JNI, we
could reduce the java->C transitions five fold. But then I realized that
with antialiasing the opposite would happen, so I'm not sure whether
JNI is a good idea.

> Have you had a look at the Netbeans profiler? It supports sampling
> based testing to keep the influence of the profiler at a minimum.

No, I don't use netbeans - it doesn't render underscores properly (although
I think this is an openjdk bug). I'll try it out.

> Thanks for working on this!

It's been fun.

Regards,
Denis.

- "Clemens Eisserer"  wrote:

> Hi Denis,
> 
> > It's not obvious to me why this happened, so I think now I will put
> > this type of optimization aside and convert to JNI,
> 

> 
> > where profiling
> > will be easier (for me - I haven't been able to make OProfile work
> > for java yet).
> 
> - Clemens


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Denis Lila
Hi Jim.

> Also, I've gotten another 20% improvement out of the design with a few
> more tweaks.  (Though I measured the 20% in the stripped down version
> that I'm prototyping with FX so I'm not sure how much of that 20%
> would show up through the layers of the 2D code.  Overall, I've about
> doubled the frame rates of the prototype since your first drop that you
> checked in to the OpenJDK repository.)

Can I see your new version?

> How about looking more at the stroking end of the process and I'll dig
> a little more into optimal rasterization code.  I have a lot of
> experience with optimizing rasterizer code (and JNI if it comes to that), but
> very little with the curve manipulations involved in stroking (math is so 
> *hard* at my age ;-)...

Sounds good. Have you implemented your idea of processing one pixel row at a
time, as opposed to processing subpixel rows? If not, I could do that.

Thanks,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
>   ...jim
> 
> On 11/4/2010 9:20 AM, Clemens Eisserer wrote:
> > Hi Denis,
> >
> >> It's not obvious to me why this happened, so I think now I will
> put
> >> this type of optimization aside and convert to JNI,
> >
> > I've only followed your discussion with Jim but skipped all the
> > in-depth discussion.
> >> From my prior experience, JNI is usually not worth the trouble, if
> you
> > don't have a serious reason why using native code would have
> > advantages (like the possibility of using SIMD or when value-types
> > would be a huge benefit), it has its own performance pitfalls
> > especially if the workload is small and things like
> Get*ArrayCritical
> > cause scalability problems because they have to lock the GC.
> >
> >> where profiling
> >> will be easier (for me - I haven't been able to make OProfile work
> >> for java yet).
> >
> > Have you had a look at the Netbeans profiler? It supports sampling
> > based testing to keep the influence of the profiler at a minimum.
> >
> > Thanks for working on this!
> >
> > - Clemens


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-04 Thread Jim Graham

Hi Denis,

Also, I've gotten another 20% improvement out of the design with a few 
more tweaks.  (Though I measured the 20% in the stripped down version 
that I'm prototyping with FX so I'm not sure how much of that 20% would 
show up through the layers of the 2D code.  Overall, I've about doubled 
the frame rates of the prototype since your first drop that you checked 
in to the OpenJDK repository.)


How about looking more at the stroking end of the process and I'll dig a 
little more into optimal rasterization code.  I have a lot of experience 
with optimizing rasterizer code (and JNI if it comes to that), but very 
little with the curve manipulations involved in stroking (math is so 
*hard* at my age ;-)...


...jim

On 11/4/2010 9:20 AM, Clemens Eisserer wrote:

Hi Denis,


It's not obvious to me why this happened, so I think now I will put
this type of optimization aside and convert to JNI,


I've only followed your discussion with Jim but skipped all the
in-depth discussion.

From my prior experience, JNI is usually not worth the trouble: if you
don't have a serious reason why using native code would have
advantages (like the possibility of using SIMD or when value-types
would be a huge benefit), it has its own performance pitfalls,
especially if the workload is small, and things like Get*ArrayCritical
cause scalability problems because they have to lock the GC.


where profiling
will be easier (for me - I haven't been able to make OProfile work
for java yet).


Have you had a look at the Netbeans profiler? It supports sampling
based testing to keep the influence of the profiler at a minimum.

Thanks for working on this!

- Clemens


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-04 Thread Clemens Eisserer
Hi Denis,

> It's not obvious to me why this happened, so I think now I will put
> this type of optimization aside and convert to JNI,

I've only followed your discussion with Jim but skipped all the
in-depth discussion.
From my prior experience, JNI is usually not worth the trouble: if you
don't have a serious reason why using native code would have
advantages (like the possibility of using SIMD or when value-types
would be a huge benefit), it has its own performance pitfalls,
especially if the workload is small, and things like Get*ArrayCritical
cause scalability problems because they have to lock the GC.

> where profiling
> will be easier (for me - I haven't been able to make OProfile work
> for java yet).

Have you had a look at the Netbeans profiler? It supports sampling
based testing to keep the influence of the profiler at a minimum.

Thanks for working on this!

- Clemens


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-02 Thread Jim Graham

Hi Denis,

I had a bit of luck with the following next() method:

private int next() {
    // TODO: make function that convert from y value to bucket idx?
    int bucket = nextY - boundsMinY;
    for (int ecur = edgeBuckets[bucket]; ecur != NULL; ecur = (int)edges[ecur+NEXT]) {
        edgePtrs = LilaHelpers.widenArray(edgePtrs, edgeCount, 1);
        edgePtrs[edgeCount++] = ecur;
        // REMIND: Adjust start Y if necessary
    }
    int crossingCount = edgeCount;
    crossings = LilaHelpers.widenArray(crossings, 0, crossingCount);

    nextY++;
    for (int i = 0; i < edgeCount; i++) {
        int ecur = edgePtrs[i];
        float curx = edges[ecur+CURX];
        int cross = ((int) curx) << 1;
        edges[ecur+CURX] = curx + edges[ecur+SLOPE];
        if (edges[ecur+OR] > 0) {
            cross |= 1;
        }
        int j = i;
        while (--j >= 0) {
            int jcross = crossings[j];
            if (jcross <= cross) {
                break;
            }
            crossings[j+1] = jcross;
            edgePtrs[j+1] = edgePtrs[j];
        }
        crossings[j+1] = cross;
        edgePtrs[j+1] = ecur;
    }
    int newCount = 0;
    for (int i = 0; i < edgeCount; i++) {
        int ecur = edgePtrs[i];
        if (edges[ecur+YMAX] > nextY) {
            edgePtrs[newCount++] = ecur;
        }
    }
    edgeCount = newCount;
    return crossingCount;
}

This allowed me to:

- delete a lot of the bucket helper functions.
- get rid of the back pointers
- pare an edge down to 5 values (YMAX, CURX, OR, SLOPE, and NEXT)

I also used the following addLine() method:

private void addLine(float x1, float y1, float x2, float y2, int or) {
    final int firstCrossing = (int)Math.max(Math.ceil(y1), boundsMinY);
    if (firstCrossing >= boundsMaxY) {
        return;
    }
    final int ptr = numEdges * SIZEOF_EDGE;
    final float slope = (x2 - x1) / (y2 - y1);
    edges = LilaHelpers.widenArray(edges, ptr, SIZEOF_EDGE);
    numEdges++;
    edges[ptr+OR] = or;
    edges[ptr+CURX] = x1 + (firstCrossing - y1) * slope;
    edges[ptr+SLOPE] = slope;
    edges[ptr+YMAX] = y2;
    final int bucketIdx = firstCrossing - boundsMinY;
    addEdgeToBucket(ptr, bucketIdx);
}

How does that fare for you?

...jim

On 11/2/2010 4:10 PM, Denis Lila wrote:

Hi Jim.

I implemented a middle ground between what I sent yesterday and
what we have now: using the array of linked lists only to replace
the quicksort (I think this was your original suggestion).

Unfortunately, this didn't work out (at least according to the
benchmarks). Curves were no different than what we used to have,
while the performance on lines (especially horizontal ones) decreased
significantly.

It's not obvious to me why this happened, so I think now I will put
this type of optimization aside and convert to JNI, where profiling
will be easier (for me - I haven't been able to make OProfile work
for java yet).

Regards,
Denis.

- "Jim Graham"  wrote:


Hi Denis,

A generic suggestion - make all of your classes final - that gives the

compiler the maximum flexibility to inline any methods you write.

With respect to the algorithm choices:

I think the key is that the X sorting rarely has any work to do.  The

first test of "does this edge need to be swapped with the next lower
edge" is probably 99.999% guaranteed to be false.  Thus, trying to
optimize anything else to simplify swapping is likely a step in the
wrong direction.

The casting may be hurting a bit more, and it is hurting on every
access
to an edge.

I'm guessing the best model to use would be to write the code to
assume
no swapping is necessary at all.  Get that code as simple and as fast
as
can be.  Then add as little perturbation of that code to manage
swapping
when it is necessary, and that will likely be the optimal
implementation.  I think you could probably even do some benchmarking
on
a path that is carefully tested to process lots of edges without any X

sorting and get that as fast as you can without any swap code, and
then
add the swap code that impacts the performance of that operation as
little as possible.  The key is that the swap code have as little
impact
on the performance of the case that never needs any swapping at all
first and foremost - then make swapping faster within that
constraint...

...jim

On 11/1/10 3:13 PM, Denis Lila wrote:

Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the

scanlines

too. What happens is that on any iteration, the acti

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-02 Thread Denis Lila
Hi Jim.

I implemented a middle ground between what I sent yesterday and
what we have now: using the array of linked lists only to replace
the quicksort (I think this was your original suggestion).

Unfortunately, this didn't work out (at least according to the
benchmarks). Curves were no different than what we used to have,
while the performance on lines (especially horizontal ones) decreased
significantly.

It's not obvious to me why this happened, so I think now I will put
this type of optimization aside and convert to JNI, where profiling
will be easier (for me - I haven't been able to make OProfile work
for java yet).

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> A generic suggestion - make all of your classes final - that gives the
> 
> compiler the maximum flexibility to inline any methods you write.
> 
> With respect to the algorithm choices:
> 
> I think the key is that the X sorting rarely has any work to do.  The
> 
> first test of "does this edge need to be swapped with the next lower 
> edge" is probably 99.999% guaranteed to be false.  Thus, trying to 
> optimize anything else to simplify swapping is likely a step in the 
> wrong direction.
> 
> The casting may be hurting a bit more, and it is hurting on every
> access 
> to an edge.
> 
> I'm guessing the best model to use would be to write the code to
> assume 
> no swapping is necessary at all.  Get that code as simple and as fast
> as 
> can be.  Then add as little perturbation of that code to manage
> swapping 
> when it is necessary, and that will likely be the optimal 
> implementation.  I think you could probably even do some benchmarking
> on 
> a path that is carefully tested to process lots of edges without any X
> 
> sorting and get that as fast as you can without any swap code, and
> then 
> add the swap code that impacts the performance of that operation as 
> little as possible.  The key is that the swap code have as little
> impact 
> on the performance of the case that never needs any swapping at all 
> first and foremost - then make swapping faster within that
> constraint...
> 
>   ...jim
> 
> On 11/1/10 3:13 PM, Denis Lila wrote:
> > Hi Jim.
> >
> > I implemented your bucket sort idea. I'm not just using the buckets
> > to remove the y-sort. I use them in the iteration through the
> scanlines
> > too. What happens is that on any iteration, the active list is the
> > doubly linked list buckets[nextY-boundsMinY]. I did this because I
> thought
> > less memory would need to be moved around compared to when we just
> kept
> > the active list "pointers" in an array. For example, with doubly
> linked
> > lists we can implement insertion sort with O(1) writes. With arrays
> we
> > have to use O(n) writes. This also allows us to get rid of the
> edgePtrs
> > array.
> >
> > I ran some benchmarks, and unfortunately I was wrong somewhere. All
> the tests
> > are at least 10% slower than the insertion sort version of what we
> have now.
> > I can't immediately see why this is. The only thing I can think of
> is that
> > there are a lot of float ->  int casts, but then again, I don't know
> how
> > slow this operation is. It would be good if it's because of the
> casts because
> > it would no longer be an issue when we convert to native.
> >
> > I also made an unrelated change: I now find the orientation and
> update x0,y0
> > much earlier than we used to. The way I was doing it before was
> silly.
> >
> > Regards,
> > Denis.
> >
> > - "Jim Graham"  wrote:
> >
> >> Hi Denis,
> >>
> >> Good news!
> >>
> >> On 10/28/2010 3:27 PM, Denis Lila wrote:
>  If we moved to a Curve class or some other way to
>  consolidate the 3 lists (may be easier in native code), this
> might
> >> win
>  in more cases...
> >>>
> >>> Does that mean you no longer think we should flatten every curve
> as
> >> soon
> >>> as we get it?
> >>
> >> No, I was just discussing the feasibility of that one suggestion
> in
> >> the
> >> context of all of the possibilities of whether or not we took the
> >> other
> >> choices.  If you think that flattening will pay off then we don't
> have
> >>
> >> to worry about the 3 lists.  It was just that when I hacked it in,
> I
> >> still had your 3 lists and so the pros and cons of horizontal edge
> >> sorting were relative to that version of the renderer...
> >>
> >>...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-01 Thread Jim Graham

Hi Denis,

A generic suggestion - make all of your classes final - that gives the 
compiler the maximum flexibility to inline any methods you write.


With respect to the algorithm choices:

I think the key is that the X sorting rarely has any work to do.  The 
first test of "does this edge need to be swapped with the next lower 
edge" is probably 99.999% guaranteed to be false.  Thus, trying to 
optimize anything else to simplify swapping is likely a step in the 
wrong direction.


The casting may be hurting a bit more, and it is hurting on every access 
to an edge.


I'm guessing the best model to use would be to write the code to assume 
no swapping is necessary at all.  Get that code as simple and as fast as 
can be.  Then add as little perturbation of that code to manage swapping 
when it is necessary, and that will likely be the optimal 
implementation.  I think you could probably even do some benchmarking on 
a path that is carefully tested to process lots of edges without any X 
sorting and get that as fast as you can without any swap code, and then 
add the swap code that impacts the performance of that operation as 
little as possible.  The key is that the swap code have as little impact 
on the performance of the case that never needs any swapping at all 
first and foremost - then make swapping faster within that constraint...


...jim

On 11/1/10 3:13 PM, Denis Lila wrote:

Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the scanlines
too. What happens is that on any iteration, the active list is the
doubly linked list buckets[nextY-boundsMinY]. I did this because I thought
less memory would need to be moved around compared to when we just kept
the active list "pointers" in an array. For example, with doubly linked
lists we can implement insertion sort with O(1) writes. With arrays we
have to use O(n) writes. This also allows us to get rid of the edgePtrs
array.

I ran some benchmarks, and unfortunately I was wrong somewhere. All the tests
are at least 10% slower than the insertion sort version of what we have now.
I can't immediately see why this is. The only thing I can think of is that
there are a lot of float -> int casts, though I don't know how slow that
operation is. If the casts really are the cause, the slowdown would at least
disappear once we convert to native code.

I also made an unrelated change: I now find the orientation and update x0,y0
much earlier than we used to. The way I was doing it before was silly.

Regards,
Denis.

- "Jim Graham"  wrote:


Hi Denis,

Good news!

On 10/28/2010 3:27 PM, Denis Lila wrote:

If we moved to a Curve class or some other way to
consolidate the 3 lists (may be easier in native code), this might

win

in more cases...


Does that mean you no longer think we should flatten every curve as

soon

as we get it?


No, I was just discussing the feasibility of that one suggestion in
the
context of all of the possibilities of whether or not we took the
other
choices.  If you think that flattening will pay off then we don't have

to worry about the 3 lists.  It was just that when I hacked it in, I
still had your 3 lists and so the pros and cons of horizontal edge
sorting were relative to that version of the renderer...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-01 Thread Denis Lila
Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the scanlines
too. What happens is that on any iteration, the active list is the
doubly linked list buckets[nextY-boundsMinY]. I did this because I thought
less memory would need to be moved around compared to when we just kept
the active list "pointers" in an array. For example, with doubly linked
lists we can implement insertion sort with O(1) writes. With arrays we
have to use O(n) writes. This also allows us to get rid of the edgePtrs
array.
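As a rough illustration of the O(1)-writes argument above, here is a minimal sketch (the names are hypothetical, not the Pisces code): moving an out-of-place node back to its sorted position in a doubly linked list costs a constant number of pointer writes once the insertion point is found, while an array-based insertion sort must shift every element the value passes over.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: 'x' stands in for an edge's current X coordinate.
final class DListSortSketch {
    static final class Node {
        final int x;
        Node prev, next;
        Node(int x) { this.x = x; }
    }

    Node head;

    // Append a node at the tail (O(n) walk; fine for a sketch).
    Node append(int x) {
        Node n = new Node(x);
        if (head == null) { head = n; return n; }
        Node t = head;
        while (t.next != null) t = t.next;
        t.next = n;
        n.prev = t;
        return n;
    }

    // Restore sorted order assuming only 'n' may be out of place: compares
    // to find the insertion point, then a constant number of pointer writes.
    void resort(Node n) {
        Node p = n.prev;
        if (p == null || p.x <= n.x) {
            return;                          // common case: already in order
        }
        // unlink n
        p.next = n.next;
        if (n.next != null) n.next.prev = p;
        // walk back to the first node that should precede n
        while (p.prev != null && p.prev.x > n.x) p = p.prev;
        // relink n just before p
        n.prev = p.prev;
        n.next = p;
        if (p.prev != null) p.prev.next = n; else head = n;
        p.prev = n;
    }

    List<Integer> toList() {
        List<Integer> out = new ArrayList<>();
        for (Node t = head; t != null; t = t.next) out.add(t.x);
        return out;
    }
}
```

Note that the compare count is the same as for an array; only the write count changes, which is why the benchmark result above is surprising.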

I ran some benchmarks, and unfortunately I was wrong somewhere. All the tests
are at least 10% slower than the insertion sort version of what we have now.
I can't immediately see why this is. The only thing I can think of is that
there are a lot of float -> int casts, though I don't know how slow that
operation is. If the casts really are the cause, the slowdown would at least
disappear once we convert to native code.

I also made an unrelated change: I now find the orientation and update x0,y0
much earlier than we used to. The way I was doing it before was silly.

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> Good news!
> 
> On 10/28/2010 3:27 PM, Denis Lila wrote:
> >> If we moved to a Curve class or some other way to
> >> consolidate the 3 lists (may be easier in native code), this might
> win
> >> in more cases...
> >
> > Does that mean you no longer think we should flatten every curve as
> soon
> > as we get it?
> 
> No, I was just discussing the feasibility of that one suggestion in
> the 
> context of all of the possibilities of whether or not we took the
> other 
> choices.  If you think that flattening will pay off then we don't have
> 
> to worry about the 3 lists.  It was just that when I hacked it in, I 
> still had your 3 lists and so the pros and cons of horizontal edge 
> sorting were relative to that version of the renderer...
> 
>   ...jim
/*
 * Copyright (c) 2007, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

package sun.java2d.pisces;

import java.util.Iterator;

import sun.awt.geom.PathConsumer2D;

public class Renderer implements PathConsumer2D {

private class ScanlineIterator {

    private int[] crossings;

    // crossing bounds. The bounds are not necessarily tight (the scan line
    // at minY, for example, might have no crossings). The x bounds will
    // be accumulated as crossings are computed.
    private final int maxY;
    private int nextY;

    private static final int INIT_CROSSINGS_SIZE = 10;

    private ScanlineIterator() {
        crossings = new int[INIT_CROSSINGS_SIZE];

        // We don't care if we clip some of the line off with ceil, since
        // no scan line crossings will be eliminated (in fact, the ceil is
        // the y of the first scan line crossing).
        final int minY = getFirstScanLineCrossing();
        nextY = minY;
        maxY = Math.min(boundsMaxY, (int)Math.ceil(edgeMaxY));
    }

    private int next() {
        // TODO: make function that convert from y value to bucket idx?
        int bucket = nextY - boundsMinY;
        final int numLeft = removeEdgesBelowY(bucket, edgeBuckets[bucket], nextY);

        crossings = Helpers.widenArray(crossings, 0, numLeft);

        for (int ecur = edgeBuckets[bucket]; ecur != NULL; ecur = (int)edges[ecur+NEXT]) {
            makeSortedByCurX(bucket, ecur);
        }

        int crossingIdx = 0;
        // Now every edge between lo and hi crosses nextY. Compute its
        // crossing and put it in the crossings array.
        int prev = NULL;
        for (int ecur = edgeBuckets[bucket]; ecur != NULL; ecur = (int)edges[ecur+NEXT]) {

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-28 Thread Jim Graham

Hi Denis,

Good news!

On 10/28/2010 3:27 PM, Denis Lila wrote:

If we moved to a Curve class or some other way to
consolidate the 3 lists (may be easier in native code), this might win
in more cases...


Does that mean you no longer think we should flatten every curve as soon
as we get it?


No, I was just discussing the feasibility of that one suggestion in the 
context of all of the possibilities of whether or not we took the other 
choices.  If you think that flattening will pay off then we don't have 
to worry about the 3 lists.  It was just that when I hacked it in, I 
still had your 3 lists and so the pros and cons of horizontal edge 
sorting were relative to that version of the renderer...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-28 Thread Denis Lila
Hi Jim.

I removed the cubic and quadratic lists and am now flattening
everything as it comes in, like you suggested. I ran some AA
benchmarks and the curves test was about 15% faster.

Then I started using your insertion sort in the only edges version
and re ran the benchmarks. Curves improved from 15% better to 20-22%
and lines improved from nothing to 2-20%, depending on how vertical
the lines were.

I didn't expect that the sorting would make this much difference.
But, it does, so I think it is worth it to work more on it; therefore,
next I will implement the bucket sort you suggested.

> If we moved to a Curve class or some other way to 
> consolidate the 3 lists (may be easier in native code), this might win
> in more cases...

Does that mean you no longer think we should flatten every curve as soon
as we get it?

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> On 10/26/2010 6:58 AM, Denis Lila wrote:
> >> 90% (guesstimate) of the time edges do not cross each other, thus
> if
> >> you sort the crossings without reordering the active edges then you
> just
> >> end up doing the same sorting work (same swaps) on the next
> scanline.  My
> >> SpanShapeIterator code actually reordered the edges on every
> sample
> >> line to match their current X coordinates in a way that involved 1
> compare
> >> per edge that was processed and only occasionally a swap of 2 edge
> >> pointers.  It would basically eliminate the sort at line 149 at
> the
> >> cost of only doing a compare in the nextY processing for the very
> very
> >> common case.
> >
> > I also implemented this some time ago. I didn't keep it because it
> slowed
> > things down a bit. However, I only tested it with very complex and
> large
> > paths, and in hindsight, I shouldn't have based my decision on that,
> so I
> > will re-implement it.
> 
> I tried using this new rasterizer in FX and I had a test case which
> had 
> a few shapes that were essentially zig-zaggy shapes on the top and 
> bottom and fills between the zigzags (kind of like a seismic chart
> with 
> fills between the pen squiggles).
> 
> When I added a simple sort of the linear edges the performance nearly
> 
> doubled in speed.  Here is the code I replaced your quad-next-edge
> loop 
> with:
> 
>  for (int i = elo; i < ehi; i++) {
>  int ptr = edgePtrs[i];
>  int cross = ((int) edges[ptr+CURX]) << 1;
>  if (edges[ptr+OR] > 0) {
>  cross |= 1;
>  }
>  edges[ptr+CURY] += 1;
>  edges[ptr+CURX] += edges[ptr+SLOPE];
>  int j = crossingIdx;
>  while (--j >= 0) {
>  int jcross = crossings[j];
>  if (cross >= jcross) {
>  break;
>  }
>  crossings[j+1] = jcross;
>  edgePtrs[elo+j+1] = edgePtrs[elo+j];
>  }
>  crossings[j+1] = cross;
>  edgePtrs[elo+j+1] = ptr;
>  crossingIdx++;
>  }
> 
> I then did a conditional sort, moved to right after the qlo->qhi and 
> clo->chi loops:
> 
>  for (int i = qlo; i < qhi; i++) {
>  // same stuff
>  }
>  for (int i = clo; i < chi; i++) {
>  // same stuff
>  }
>  if (qhi > qlo || chi > clo) {
>  Arrays.sort(crossings, 0, crossingIdx);
>  }
> 
> In the case of my test case where I only had a polygon to fill, the 
> performance doubled.  Obviously I didn't do much for the case where 
> there were curves because this was just a quick hack to see the value
> of sorting the edges.
> 
>   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-27 Thread Jim Graham

Hi Denis,

On 10/26/2010 6:58 AM, Denis Lila wrote:

90% (guesstimate) of the time edges do not cross each other, thus if
you sort the crossings without reordering the active edges then you just
end up doing the same sorting work (same swaps) on the next scanline.  My
SpanShapeIterator code actually reordered the edges on every sample
line to match their current X coordinates in a way that involved 1 compare
per edge that was processed and only occasionally a swap of 2 edge
pointers.  It would basically eliminate the sort at line 149 at the
cost of only doing a compare in the nextY processing for the very very
common case.


I also implemented this some time ago. I didn't keep it because it slowed
things down a bit. However, I only tested it with very complex and large
paths, and in hindsight, I shouldn't have based my decision on that, so I
will re-implement it.


I tried using this new rasterizer in FX and I had a test case which had 
a few shapes that were essentially zig-zaggy shapes on the top and 
bottom and fills between the zigzags (kind of like a seismic chart with 
fills between the pen squiggles).


When I added a simple sort of the linear edges the performance nearly 
doubled in speed.  Here is the code I replaced your quad-next-edge loop 
with:


for (int i = elo; i < ehi; i++) {
    int ptr = edgePtrs[i];
    int cross = ((int) edges[ptr+CURX]) << 1;
    if (edges[ptr+OR] > 0) {
        cross |= 1;
    }
    edges[ptr+CURY] += 1;
    edges[ptr+CURX] += edges[ptr+SLOPE];
    int j = crossingIdx;
    while (--j >= 0) {
        int jcross = crossings[j];
        if (cross >= jcross) {
            break;
        }
        crossings[j+1] = jcross;
        edgePtrs[elo+j+1] = edgePtrs[elo+j];
    }
    crossings[j+1] = cross;
    edgePtrs[elo+j+1] = ptr;
    crossingIdx++;
}

I then did a conditional sort, moved to right after the qlo->qhi and 
clo->chi loops:


for (int i = qlo; i < qhi; i++) {
    // same stuff
}
for (int i = clo; i < chi; i++) {
    // same stuff
}
if (qhi > qlo || chi > clo) {
    Arrays.sort(crossings, 0, crossingIdx);
}

In the case of my test case where I only had a polygon to fill, the 
performance doubled.  Obviously I didn't do much for the case where 
there were curves because this was just a quick hack to see the value of 
sorting the edges.  If we moved to a Curve class or some other way to 
consolidate the 3 lists (may be easier in native code), this might win 
in more cases...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-26 Thread Jim Graham

On 10/26/2010 6:58 AM, Denis Lila wrote:

If we are really worried about the y-sort, then how about creating a
bunch of buckets and doing a bucket sort of the edges?  As they are
added to the list of segments, we accumulate their indices in a row
list based on their startY so that each step of the "next()" simply moves
to the next Y and adds the edges mentioned in the list there.  Some work
would have to be done on how to manage the storage simply (like a
"rownext" field in the edge structure so that they are linked in a
linked list), but then there would be no qsort at all for the cost of
new int[N_ROWS] and an extra field in every edge.


I like this.


It's even more frugal if combined with my idea that an entire full pixel 
row could be processed in one batch rather than processing each 
sub-pixel row separately.



90% (guesstimate) of the time edges do not cross each other, thus if
you sort the crossings without reordering the active edges then you just
end up doing the same sorting work (same swaps) on the next scanline.  My
SpanShapeIterator code actually reordered the edges on every sample
line to match their current X coordinates in a way that involved 1 compare
per edge that was processed and only occasionally a swap of 2 edge
pointers.  It would basically eliminate the sort at line 149 at the
cost of only doing a compare in the nextY processing for the very very
common case.


I also implemented this some time ago. I didn't keep it because it slowed
things down a bit. However, I only tested it with very complex and large
paths, and in hindsight, I shouldn't have based my decision on that, so I
will re-implement it.


First, this is less helpful if we process the edges/quads/curves in 3 
separate batches because you'd still have 3 sets of crossings to 
interweave and so some sorting would still be necessary, but keeping 
each of the 3 sets sorted amongst themselves might still help with 
reducing the swapping work of the sorting step.  This technique would 
work best if all of the edge types were processed from one list.


Also, be sure to test with both clockwise and counterclockwise versions 
of the test paths just in case.  The (constantly re)sorting hit will be 
more expensive with one than the other since one of them will have the 
edges already in the sorted order and the other would require maximum 
swaps.  I'd check:


Simple CW rectangle
Simple CCW rectangle
CW and CCW rectangles with 500(?) zig-zags along top edge
CW and CCW rectangles with 500(?) zig-zags along both top and bottom

And a note related to another suggestion I'd made, I think if we did 
something like "process an entire pixel row at once" it would complicate 
the "auto-edge sorting" a bit.  If every curve is iterated to the bottom 
of the pixel in turn before moving to the next edge then its crossings 
may swap with another edge's crossings, but that next edge would not be 
processed yet and then there may be a mish-mosh of N subpixel values 
from the two edges to interweave.  If it was done by processing each 
subpixel row in turn then the edges could be kept sorted as the subpixel 
rows were iterated, but it would reduce the savings on the overhead of 
swapping between edges.  The best compromise might be to process every 
curve fully, verify its sorted position in the edge array using its X 
coordinate at the bottom of the pixel, still sort the whole pixel row's 
worth of crossing values, but hopefully the sorting would have very 
little work to do if the original values were "mostly sorted" by having 
kept the edges in order at pixel boundaries (a "mostly sorted" friendly 
sort algorithm, like insertion sort, might be needed).
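An insertion sort is the natural fit for that last step because its cost is proportional to the number of inversions, so a "mostly sorted" crossings array costs nearly linear time. A generic sketch of the idea (not the actual Pisces code):

```java
// Insertion sort over a range of the crossings array.  Runtime is
// O(n + inversions), so it is close to O(n) when the values are already
// almost in order from the previous scanline.
class MostlySorted {
    static void insertionSort(int[] a, int from, int to) {
        for (int i = from + 1; i < to; i++) {
            int v = a[i];
            int j = i - 1;
            // Shift larger elements right until v's slot is found.
            while (j >= from && a[j] > v) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = v;
        }
    }
}
```

On fully sorted input the inner while loop never runs, which is exactly the case the pixel-boundary ordering is meant to produce.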


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-26 Thread Denis Lila
Hi Jim.

> Just to be certain - you are still planning on putting the existing 
> stuff back and we're talking about future work, right?  I'd love to
> get a stake in the ground here.

Yes, I'll push today.

> If we are really worried about the y-sort, then how about creating a 
> bunch of buckets and doing a bucket sort of the edges?  As they are 
> added to the list of segments, we accumulate their indices in a row
> list based on their startY so that each step of the "next()" simply moves
> to the next Y and adds the edges mentioned in the list there.  Some work
> would have to be done on how to manage the storage simply (like a 
> "rownext" field in the edge structure so that they are linked in a 
> linked list), but then there would be no qsort at all for the cost of
> new int[N_ROWS] and an extra field in every edge.

I like this.

> Perhaps we should work on the algorithms a little more then (I'm
> talking about the numeric stuff, not the memory management stuff since the 
> memory management techniques will differ quite a lot in C code, but 
> better math helps at either level)?

Indeed - especially the dashing.

> 90% (guesstimate) of the time edges do not cross each other, thus if
> you sort the crossings without reordering the active edges then you just
> end up doing the same sorting work (same swaps) on the next scanline.  My
> SpanShapeIterator code actually reordered the edges on every sample
> line to match their current X coordinates in a way that involved 1 compare
> per edge that was processed and only occasionally a swap of 2 edge 
> pointers.  It would basically eliminate the sort at line 149 at the
> cost of only doing a compare in the nextY processing for the very very
> common case.

I also implemented this some time ago. I didn't keep it because it slowed
things down a bit. However, I only tested it with very complex and large
paths, and in hindsight, I shouldn't have based my decision on that, so I
will re-implement it.

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> On 10/25/2010 3:30 PM, Denis Lila wrote:
> >> - Create a curve class and store an array of those so you don't
> have
> >> to iterate 3 different arrays of values and use array accesses to
> grab
> >> the data (each array access is checked for index OOB exceptions).
> >
> > I actually implemented something like this in my first draft of the
> current
> > version of renderer. I didn't stick with it because it was slower
> than
> > even what we have in openjdk6. Granted, my implementation was a bit
> more
> > high level than simply making 3 classes to represent lines quads and
> cubics,
> > but it seemed pretty hopeless, so I didn't spend any time figuring
> out
> > exactly what it was that made it slower.
> 
> Hmmm...  Perhaps object allocation overhead was biting us there.  In 
> native code you could cobble this up with batch allocation of space
> and 
> pseudo-object/struct allocation out of the batches.
> 
> >> - Or perform AFD on curves as they come into Renderer, but only
> save
> >> line segment edges in the edges array.  This would use more
> storage,
> >> but the iterations of the AFD would happen in a tight loop as the
> data
> >> comes in rather than having to store all of their AFD data back in
> the quads
> >
> > I considered this before abandoning the high level version I mention
> above.
> > I didn't go with it because, while I am not worried about the higher
> memory
> > consumption, I was afraid of the impact that having this many edges
> would
> > have on the qsort call and on lines 99-113 and 140-148 in next().
> 
> I can see worrying about qsort, but I think one qsort would be 
> inherently faster than 3 qsorts which you have anyway so the
> difference 
> would get lost in the noise.  Also, I'm not sure how the 99 to 113 and
> 
> 140 to 148 would be affected.  The path will have the same number of 
> crossings per sample row regardless of whether the curves have been 
> flattened or not.  You might be adding and deleting edges from the 
> active list more often, though (in other words, 99 would dump more 
> curves and 140 would take in more curves), but the number of edges or
> 
> curves at any given point would not be affected by flattening.  Also,
> 
> the way you've written the loops at 99, they have to copy every 
> edge/quad/curve that *doesn't* expire so a skipped curve is actually 
> less work for that loop.  The loops at 140 would have to occasionally
> do 
> more processing, but it is made up for in the fact that 99 does less 
> work and the "nextY" processing is simpler.
> 
> > How about this: we change the format of the edges array to be an
> array of
> > sequences of edges. So, right now the edges array has this format:
> > E* where E represents 6 consecutive floats
> {ymin,ymax,curx,cury,or,slope}.
> > I am proposing we change it to
> {n,or}({ymin,ymax,curx,cury,slope})^n.
> > lineTo's would add an edge sequence with n=1 to the edges array. If
> a call
> > to quadTo or c

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Jim Graham

Hi Denis,

Just to be certain - you are still planning on putting the existing 
stuff back and we're talking about future work, right?  I'd love to get 
a stake in the ground here.


On 10/25/2010 3:30 PM, Denis Lila wrote:

- Create a curve class and store an array of those so you don't have
to iterate 3 different arrays of values and use array accesses to grab
the data (each array access is checked for index OOB exceptions).


I actually implemented something like this in my first draft of the current
version of renderer. I didn't stick with it because it was slower than
even what we have in openjdk6. Granted, my implementation was a bit more
high level than simply making 3 classes to represent lines quads and cubics,
but it seemed pretty hopeless, so I didn't spend any time figuring out
exactly what it was that made it slower.


Hmmm...  Perhaps object allocation overhead was biting us there.  In 
native code you could cobble this up with batch allocation of space and 
pseudo-object/struct allocation out of the batches.



- Or perform AFD on curves as they come into Renderer, but only save
line segment edges in the edges array.  This would use more storage,
but the iterations of the AFD would happen in a tight loop as the data
comes in rather than having to store all of their AFD data back in the quads


I considered this before abandoning the high level version I mention above.
I didn't go with it because, while I am not worried about the higher memory
consumption, I was afraid of the impact that having this many edges would
have on the qsort call and on lines 99-113 and 140-148 in next().


I can see worrying about qsort, but I think one qsort would be 
inherently faster than 3 qsorts which you have anyway so the difference 
would get lost in the noise.  Also, I'm not sure how the 99 to 113 and 
140 to 148 would be affected.  The path will have the same number of 
crossings per sample row regardless of whether the curves have been 
flattened or not.  You might be adding and deleting edges from the 
active list more often, though (in other words, 99 would dump more 
curves and 140 would take in more curves), but the number of edges or 
curves at any given point would not be affected by flattening.  Also, 
the way you've written the loops at 99, they have to copy every 
edge/quad/curve that *doesn't* expire so a skipped curve is actually 
less work for that loop.  The loops at 140 would have to occasionally do 
more processing, but it is made up for in the fact that 99 does less 
work and the "nextY" processing is simpler.



How about this: we change the format of the edges array to be an array of
sequences of edges. So, right now the edges array has this format:
E* where E represents 6 consecutive floats {ymin,ymax,curx,cury,or,slope}.
I am proposing we change it to {n,or}({ymin,ymax,curx,cury,slope})^n.
lineTo's would add an edge sequence with n=1 to the edges array. If a call
to quadTo or curveTo produced N curves, it would simply put N,or at the
end of the edges array, and then append the data for the N produced edges.
I think this would give us the best of both worlds, in that we can do all
the AFD iterations in a tight loop close to the input methods and it
doesn't present any problems with respect to sorting or managing the
active list. It can probably be implemented completely transparently with
respect to ScanlineIterator. The details of
the implementation involve an interesting speed/memory trade off, but we
can discuss that later.


I think this might be overkill since sorts tend to have logN behavior so 
doubling the number of edges would not double the time taken in the 
sort.  Also, I would think that the sort would be a small amount of time 
compared to the rest of the processing, wasn't it?


If we are really worried about the y-sort, then how about creating a 
bunch of buckets and doing a bucket sort of the edges?  As they are 
added to the list of segments, we accumulate their indices in a row list 
based on their startY so that each step of the "next()" simply moves to 
the next Y and adds the edges mentioned in the list there.  Some work 
would have to be done on how to manage the storage simply (like a 
"rownext" field in the edge structure so that they are linked in a 
linked list), but then there would be no qsort at all for the cost of 
new int[N_ROWS] and an extra field in every edge.
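A minimal sketch of that bucket idea, assuming a hypothetical Edge object with the suggested "rownext" field (the real Pisces edges live in a flat float array, so these names are illustrative only):

```java
// Bucket the edges by their starting scanline as they arrive, using an
// intrusive linked list, so no qsort is needed later.
class EdgeBuckets {
    static final class Edge {
        int startY;
        float curx, slope;
        Edge rownext;  // next edge that starts on the same row

        Edge(int startY, float curx, float slope) {
            this.startY = startY;
            this.curx = curx;
            this.slope = slope;
        }
    }

    // rows[y] heads the list of edges with startY == y.
    final Edge[] rows;

    EdgeBuckets(int nRows) {
        // The only extra cost: new Edge[N_ROWS] plus one field per edge.
        rows = new Edge[nRows];
    }

    // O(1) insertion as each edge is added to the segment list.
    void addEdge(Edge e) {
        e.rownext = rows[e.startY];
        rows[e.startY] = e;
    }

    // next() just walks this list to pick up edges beginning on row y.
    Edge edgesStartingAt(int y) {
        return rows[y];
    }
}
```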



- Convert to native.  Note that we use a native version of the pisces
that you started with to do some rendering in FX.  I tried porting to
use your new (Java) renderer in FX and performance went down even
though you show it to be faster than what was there before.  So, your Java
renderer compares favorably to the old Java pisces, but both compare
unfavorably to the old native pisces.  Maybe we should convert your
code to native and see if that gives us a performance boost.  It's nice to
use pure Java, but there is a lot of "shoehorning" of data going on
here that could be done much more ea

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Denis Lila
Hi Jim.

> - Create a curve class and store an array of those so you don't have
> to iterate 3 different arrays of values and use array accesses to grab
> the data (each array access is checked for index OOB exceptions).

I actually implemented something like this in my first draft of the current
version of renderer. I didn't stick with it because it was slower than
even what we have in openjdk6. Granted, my implementation was a bit more
high level than simply making 3 classes to represent lines quads and cubics,
but it seemed pretty hopeless, so I didn't spend any time figuring out
exactly what it was that made it slower.

> - Or perform AFD on curves as they come into Renderer, but only save 
> line segment edges in the edges array.  This would use more storage,
> but the iterations of the AFD would happen in a tight loop as the data
> comes in rather than having to store all of their AFD data back in the quads

I considered this before abandoning the high level version I mention above.
I didn't go with it because, while I am not worried about the higher memory
consumption, I was afraid of the impact that having this many edges would
have on the qsort call and on lines 99-113 and 140-148 in next().
How about this: we change the format of the edges array to be an array of
sequences of edges. So, right now the edges array has this format:
E* where E represents 6 consecutive floats {ymin,ymax,curx,cury,or,slope}.
I am proposing we change it to {n,or}({ymin,ymax,curx,cury,slope})^n.
lineTo's would add an edge sequence with n=1 to the edges array. If a call
to quadTo or curveTo produced N curves, it would simply put N,or at the
end of the edges array, and then append the data for the N produced edges.
I think this would give us the best of both worlds, in that we can do all
the AFD iterations in a tight loop close to the input methods and it
doesn't present any problems with respect to sorting or managing the
active list. It can probably be implemented completely transparently with
respect to ScanlineIterator. The details of
the implementation involve an interesting speed/memory trade off, but we
can discuss that later.
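A sketch of how that {n,or}({ymin,ymax,curx,cury,slope})^n layout could be appended to; the class, names, and growth policy here are illustrative, not the actual Pisces code:

```java
// Proposed edges layout: a sequence header {n, or} followed by n records
// of {ymin, ymax, curx, cury, slope}, all in one float array.
class EdgeList {
    static final int RECORD = 5;  // floats per edge record

    float[] edges = new float[64];
    int end = 0;

    private void ensure(int extra) {
        if (end + extra > edges.length) {
            edges = java.util.Arrays.copyOf(
                    edges, Math.max(edges.length * 2, end + extra));
        }
    }

    // A quadTo/curveTo that flattens into n segments writes {n, or} once,
    // then appends the n edge records produced by the AFD loop.
    void beginSequence(int n, int or) {
        ensure(2 + n * RECORD);
        edges[end++] = n;
        edges[end++] = or;
    }

    void appendEdge(float ymin, float ymax, float curx, float cury,
                    float slope) {
        edges[end++] = ymin;
        edges[end++] = ymax;
        edges[end++] = curx;
        edges[end++] = cury;
        edges[end++] = slope;
    }

    // lineTo adds a sequence with n == 1.
    void addLine(float ymin, float ymax, float curx, float cury,
                 float slope, int or) {
        beginSequence(1, or);
        appendEdge(ymin, ymax, curx, cury, slope);
    }
}
```

Since the orientation is shared by every edge produced from one curve, storing it once per sequence is what saves a float per edge relative to the old E* layout.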

> - Convert to native.  Note that we use a native version of the pisces
> that you started with to do some rendering in FX.  I tried porting to
> use your new (Java) renderer in FX and performance went down even
> though you show it to be faster than what was there before.  So, your Java 
> renderer compares favorably to the old Java pisces, but both compare 
> unfavorably to the old native pisces.  Maybe we should convert your
> code to native and see if that gives us a performance boost.  It's nice to
> use pure Java, but there is a lot of "shoehorning" of data going on
> here that could be done much more easily and naturally in native code.

I've been wanting to do this for a very long time. C and C++ are more
convenient for this type of work. I didn't because I've never used the JNI
before. I guess this is as good a time to learn as any. I'd still like to
keep the debugging of native code to a minimum, so we should implement
as much as possible in java before starting on this.


I still need to think some more about your other suggestion. I'll reply
to it tomorrow.

> How is that for "food for thought"?
Delicious :)

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> On 10/25/2010 7:34 AM, Denis Lila wrote:
> >> (and I have some ideas on further optimizations to consider if you
> are still
> >> game after this goes in)...
> >
> > I'd love to hear what they are.
> 
> Here are my thoughts:
> 
> - Currently Renderer has more stages than we probably should have:
>     for (each full pixel row) {
>         for (each subpixel row) {
>             for (each curve on subpixel row) {
>                 convert curve into crossing
>                 crossing is (subpixelx:31 + dir:1)
>             }
>             sort crossings for subpixel row
>             apply wind rule and convert crossings into alpha deltas
>         }
>         convert alpha deltas into run lengths
>     }
>     for (each tile) {
>         convert run lengths into alphas
>     }
> I'm thinking we should be able to get rid of a couple of those
> stages...
> 
> - One alternate process would be:
>     (all work is done in the tail end of the cache itself)
>     for (each full pixel row) {
>         for (each curve on full pixel row) {
>             convert curve into N subpixel crossings
>             (subpixelx:28 + subpixelfracty:3 + dir:1)
>         }
>     }
>     sort all crossings for entire pixel row
>     maybe collapse raw crossings into modified accumulated crossings
>     record start of row in index array
>     for (each tile) {
>         convert raw or collapsed crossings directly into alphas
>     }
> Note that we could simply leave the raw crossings in the cache and
> then 
> process them in the tile generator, but that would require more
> storage 
> in the cache sin

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Jim Graham

Hi Denis,

On 10/25/2010 7:34 AM, Denis Lila wrote:

(and I have some ideas on further optimizations to consider if you are still
game after this goes in)...


I'd love to hear what they are.


Here are my thoughts:

- Currently Renderer has more stages than we probably should have:
    for (each full pixel row) {
        for (each subpixel row) {
            for (each curve on subpixel row) {
                convert curve into crossing
                crossing is (subpixelx:31 + dir:1)
            }
            sort crossings for subpixel row
            apply wind rule and convert crossings into alpha deltas
        }
        convert alpha deltas into run lengths
    }
    for (each tile) {
        convert run lengths into alphas
    }
I'm thinking we should be able to get rid of a couple of those stages...

- One alternate process would be:
    (all work is done in the tail end of the cache itself)
    for (each full pixel row) {
        for (each curve on full pixel row) {
            convert curve into N subpixel crossings
            (subpixelx:28 + subpixelfracty:3 + dir:1)
        }
    }
    sort all crossings for entire pixel row
    maybe collapse raw crossings into modified accumulated crossings
    record start of row in index array
    for (each tile) {
        convert raw or collapsed crossings directly into alphas
    }
Note that we could simply leave the raw crossings in the cache and then 
process them in the tile generator, but that would require more storage 
in the cache since a given path would tend to have 8 different entries 
as it crossed every subpixel row.  If we collapse them, then we can sum 
the entries for a given x coordinate into a single entry and store:

(pixelx:25 + coveragedelta:7)
where the coverage delta is a signed quantity up to N_SUBPIX^2 so it 
uses the same storage we needed for the subpixel parts of both X and Y 
plus the direction bit.  It likely needs a paired value in the next x 
pixel location just like our current "alpha deltas" needs as well.  If 
we wanted to optimize that then we could devote one more bit to "the 
next pixel will consume all of the remainder of the N^2 coverage", but 
there would be cases where that would not be true (such as if the pixel 
row has only partial vertical coverage by the path).  It's probably 
simpler to just have deltas for every pixel.
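A sketch of the (pixelx:25 + coveragedelta:7) packing described above; the exact bit budget and sign handling here are assumptions for illustration:

```java
// Pack a pixel x coordinate and a signed coverage delta into one int:
// pixelx in the high 25 bits, the delta in the low 7 bits.  Decoding
// sign-extends the delta; ranges are not validated here.
class CrossingDelta {
    static int pack(int pixelx, int delta) {
        return (pixelx << 7) | (delta & 0x7f);
    }

    static int pixelX(int packed) {
        return packed >> 7;           // arithmetic shift keeps x ordering
    }

    static int coverageDelta(int packed) {
        return (packed << 25) >> 25;  // sign-extend the low 7 bits
    }
}
```

Keeping pixelx in the high bits means a plain integer sort on the packed values orders the entries by pixel, which is what lets them be summed per x coordinate.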


The storage here would likely be similar to the storage used for the 
current cache since the current RLE cache uses 2 full ints to store a 
coverage and a count.  And in cases where we have one pixel taking up 
partial coverage and the following pixel taking up the remainder of the 
full coverage then we have 4 ints, but the "crossing delta" system would 
only have 2 ints.


Other thoughts...

- Create a curve class and store an array of those so you don't have to 
iterate 3 different arrays of values and use array accesses to grab the 
data (each array access is checked for index OOB exceptions).


- Or perform AFD on curves as they come into Renderer, but only save 
line segment edges in the edges array.  This would use more storage, but 
the iterations of the AFD would happen in a tight loop as the data comes 
in rather than having to store all of their AFD data back in the quads 
and curves arrays and then reload the data for every sub-pixel step. 
Renderer still takes curves, it just breaks them down immediately rather 
than on the fly.  If there are only a small number of edges per curve 
then the storage might not be that much worse because the quad and curve 
arrays already store more values than the edge array.
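For context, forward differencing reduces the per-step evaluation of the curve polynomial to a few additions. This fixed-step sketch of flattening a quadratic shows the incremental idea; real AFD additionally adapts the step size to local flatness, and all names here are illustrative:

```java
// Flatten the quad (x0,y0)-(cx,cy)-(x1,y1) into n line segments by
// stepping t in equal increments with forward differences.
class QuadFD {
    // Returns the n+1 flattened points as {x0,y0, x1,y1, ...}.
    static float[] flatten(float x0, float y0, float cx, float cy,
                           float x1, float y1, int n) {
        float dt = 1f / n;
        // p(t) = a*t^2 + b*t + p0, per coordinate.
        float ax = x0 - 2*cx + x1, bx = 2*(cx - x0);
        float ay = y0 - 2*cy + y1, by = 2*(cy - y0);
        // First and second forward differences at step dt.
        float dx  = ax*dt*dt + bx*dt, dy  = ay*dt*dt + by*dt;
        float ddx = 2*ax*dt*dt,       ddy = 2*ay*dt*dt;
        float[] pts = new float[2 * (n + 1)];
        float x = x0, y = y0;
        for (int i = 0; i <= n; i++) {
            pts[2*i]   = x;
            pts[2*i+1] = y;
            x += dx; dx += ddx;  // only additions in the inner loop
            y += dy; dy += ddy;
        }
        return pts;
    }
}
```

Doing this loop once, up front, as the curves arrive is exactly the trade being discussed: more edge records, but no per-subpixel reloading of the difference state.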


- Convert to native.  Note that we use a native version of the pisces 
that you started with to do some rendering in FX.  I tried porting to 
use your new (Java) renderer in FX and performance went down even though 
you show it to be faster than what was there before.  So, your Java 
renderer compares favorably to the old Java pisces, but both compare 
unfavorably to the old native pisces.  Maybe we should convert your code 
to native and see if that gives us a performance boost.  It's nice to 
use pure Java, but there is a lot of "shoehorning" of data going on here 
that could be done much more easily and naturally in native code.


How is that for "food for thought"?

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Denis Lila
> That's great. I will be pushing today.

About that: you wrote the TransformingPathConsumer2D file,
so how should you be given credit? Should I put your name in
"Contributed-by"? Should I put an @author tag in the file?
Or does the "reviewed-by" suffice?

Regards,
Denis.

- "Denis Lila"  wrote:

> Hi Jim.
> 
> > How about this:
> > 
> > (Math.abs(len-leftLen) < err*len)
> > 
> > (noting that err*len can be calculated outside of the loop).
> 
> This is what I use now.
> 
> > Note that a custom shape can send segments in any order that it
> wants
> > so close,close can happen from a custom shape even if Path2D won't
> do
> > it.
> 
> Right, of course. For some reason I only considered the test case I
> was
using. Sorry for the slip-up.
> 
> > There is only one question on the board from my perspective - the 
> > question about dash length errors at the start of this message. 
> 
> I'm ok with the accuracy of dash length computations. My main
> concerns
> right now are dashing performance and AA rendering performance. The
> latter has improved, but not by as much as I was expecting. Also, it
> was
> bothering me that when I removed PiscesCache 1-2 weeks ago
> performance
> got worse.
> It would also be nice to find a method that is guaranteed to find
> all offset curve cusps.
> 
> > Depending on how you feel about that I think we're ready to go
> 
> That's great. I will be pushing today.
> 
> > (and I have some ideas on further optimizations to consider if you
> are still
> > game after this goes in)...
> 
> I'd love to hear what they are.
> 
> Thank you for all your time,
> Denis.
> 
> - "Jim Graham"  wrote:
> 
> > On 10/22/2010 12:22 PM, Denis Lila wrote:
> > > Because the error is meant to be relative. What I use is supposed
> to
> > be
> > > equivalent to difference/AverageOfCorrectValueAndApproximation < err,
> > > where, in our case,
> > AverageOfCorrectValueAndApproximation=(len+leafLen)/2,
> > > so that multiplication by 2 should have been a division.
> > > Should I use the less fancy difference/CorrectValue < err and skip
> > > a division by 2?
> > 
> > If it is relative, then shouldn't it be relative to the desired
> > length? 
> >   Why does the computed approximation factor in to the size of the
> > error 
> > you are looking for?  If you use the average then the amount of
> error
> > 
> > that you will "accept" will be larger if your estimate is wronger. 
> I
> > 
> > don't think the "wrongness" of your approximation should have any
> > effect 
> > on your error.
> > 
> > >> lines 98-101 - somewhere in the confusion moveToImpl became
> > redundant
> > >> with moveTo.
> > >
> > > I thought I'd leave these in because they're interface methods,
> > > and it's usually not a great idea to call these from private
> > methods. I've
> > > removed them anyway, because you said so and because I suppose
> > > if ever we need to do something to the user input that shouldn't
> be
> > done
> > > to input coming from private methods in the class (such as the
> > scaling
> > > of the input coordinates done in Renderer) we can always easily
> > > reintroduce them.
> > 
> > I actually thought about the interface concept after I sent that
> and
> > was 
> > at peace with them, but I'm also at peace with them being gone. ;-)
> > 
> > > That's right. I don't think it's what should happen, but it's
> what
> > closed
> > > jre does, so I've left it in (with some changes to make it
> actually
> > > replicate the behaviour of closed java, since it was buggy - the
> > moveTo
> > > was missing). I don't draw anything on the second close in the
> > close,close
> > > case, since that would look horrible with round joins and square
> > caps.
> > > However, the way path2D's are implemented this safeguard will not
> > be
> > > activated since a closePath() following another closePath() is
> > ignored.
> > 
> > > I also now initialize the *dxy *mxy variables in moveTo. This
> should
> > be an
> > > inconsequential change, but I've put it in just so the state of
> > > Stroker after a moveTo is well defined.
> > 
> > New code looks good (even if we think it's a silly side effect of
> > closed 
> > JDK to have to implement).
> > 
> > >> line 368 - does this cause a problem if t==1 because lastSegLen
> > will
> > >> now return the wrong value?  Maybe move this into an else clause
> on
> > the
> > >> "t>=1" test?
> > >
> > > It does cause a problem. I've fixed it by adding a variable that
> > keeps track
> > > of the length of the last returned segment. The way it was being
> > done was
> > > a dirty hack because it assumed that if the method didn't return
> in
> > the loop,
> > > done would remain false.
> > 
> > Woohoo!  I was actually bothered by the side-effecting in the old
> code
> > - 
> > this is a better approach.
> > 
> > > I made one more change in dasher: in somethingTo I removed the
> long
> > comment
> > > near the end, and I handle the case where phase == dash[idx]
> > immediately.
> > > I do this for con

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Denis Lila
Hi Jim.

> How about this:
> 
>   (Math.abs(len-leftLen) < err*len)
> 
> (noting that err*len can be calculated outside of the loop).

This is what I use now.
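For reference, a sketch of that test with err*len hoisted out of the loop (the names and the surrounding loop are illustrative, not the actual Dasher code):

```java
// Relative-error acceptance test for a subdivided leaf length:
// |len - leafLen| < err * len.  Since err and the target len are fixed
// while subdividing, the threshold is computed once outside the loop.
class DashError {
    static boolean closeEnough(float len, float leafLen, float errTimesLen) {
        return Math.abs(len - leafLen) < errTimesLen;
    }

    // Hypothetical caller showing the hoisting.
    static float refine(float len, float err, float[] leafLens) {
        float errTimesLen = err * len;  // hoisted: constant per target length
        for (float leafLen : leafLens) {
            if (closeEnough(len, leafLen, errTimesLen)) {
                return leafLen;
            }
        }
        return -1f;  // no leaf within tolerance
    }
}
```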

> Note that a custom shape can send segments in any order that it wants
> so close,close can happen from a custom shape even if Path2D won't do
> it.

Right, of course. For some reason I only considered the test case I was
using. Sorry for the slip-up.

> There is only one question on the board from my perspective - the 
> question about dash length errors at the start of this message. 

I'm ok with the accuracy of dash length computations. My main concerns
right now are dashing performance and AA rendering performance. The
latter has improved, but not by as much as I was expecting. Also, it was
bothering me that when I removed PiscesCache 1-2 weeks ago performance
got worse.
It would also be nice to find a method that is guaranteed to find
all offset curve cusps.

> Depending on how you feel about that I think we're ready to go

That's great. I will be pushing today.

> (and I have some ideas on further optimizations to consider if you are still
> game after this goes in)...

I'd love to hear what they are.

Thank you for all your time,
Denis.

- "Jim Graham"  wrote:

> On 10/22/2010 12:22 PM, Denis Lila wrote:
> > Because the error is meant to be relative. What I use is supposed to
> be
> > equivalent to difference/AverageOfCorrectValueAndApproximation < err,
> > where, in our case,
> AverageOfCorrectValueAndApproximation=(len+leafLen)/2,
> > so that multiplication by 2 should have been a division.
> > Should I use the less fancy difference/CorrectValue < err and skip
> > a division by 2?
> 
> If it is relative, then shouldn't it be relative to the desired
> length? 
>   Why does the computed approximation factor in to the size of the
> error 
> you are looking for?  If you use the average then the amount of error
> 
> that you will "accept" will be larger if your estimate is wronger.  I
> 
> don't think the "wrongness" of your approximation should have any
> effect 
> on your error.
> 
> >> lines 98-101 - somewhere in the confusion moveToImpl became
> redundant
> >> with moveTo.
> >
> > I thought I'd leave these in because they're interface methods,
> > and it's usually not a great idea to call these from private
> methods. I've
> > removed them anyway, because you said so and because I suppose
> > if ever we need to do something to the user input that shouldn't be
> done
> > to input coming from private methods in the class (such as the
> scaling
> > of the input coordinates done in Renderer) we can always easily
> > reintroduce them.
> 
> I actually thought about the interface concept after I sent that and
> was 
> at peace with them, but I'm also at peace with them being gone. ;-)
> 
> > That's right. I don't think it's what should happen, but it's what
> closed
> > jre does, so I've left it in (with some changes to make it actually
> > replicate the behaviour of closed java, since it was buggy - the
> moveTo
> > was missing). I don't draw anything on the second close in the
> close,close
> > case, since that would look horrible with round joins and square
> caps.
> > However, the way path2D's are implemented this safeguard will not
> be
> > activated since a closePath() following another closePath() is
> ignored.
> 
> > I also now initialize the *dxy *mxy variables in moveTo. This should
> be an
> > inconsequential change, but I've put it in just so the state of
> > Stroker after a moveTo is well defined.
> 
> New code looks good (even if we think it's a silly side effect of
> closed 
> JDK to have to implement).
> 
> >> line 368 - does this cause a problem if t==1 because lastSegLen
> will
> >> now return the wrong value?  Maybe move this into an else clause on
> the
> >> "t>=1" test?
> >
> > It does cause a problem. I've fixed it by adding a variable that
> keeps track
> > of the length of the last returned segment. The way it was being
> done was
> > a dirty hack because it assumed that if the method didn't return in
> the loop,
> > done would remain false.
> 
> Woohoo!  I was actually bothered by the side-effecting in the old code
> - 
> this is a better approach.
> 
> > I made one more change in dasher: in somethingTo I removed the long
> comment
> > near the end, and I handle the case where phase == dash[idx]
> immediately.
> > I do this for consistency with lineTo. The only instances where this
> makes
> > a difference is when we have a path that starts with a dash, ends at
> exactly
> > the end of a dash, and has a closePath. So something like
> move(100,0),
> > line(0,0),line(0,50),line(100,50),line(100,0),close() with a dash
> pattern {60,60}.
> > In closed jdk (and also in openjdk with my patch), no join would be
> drawn
> > at 100,0 on this path with this dash pattern.
> > Whether that's correct behaviour is arguable, imho. On one hand, if
> we don't do it
> > , but I think it is because if
> > it isn't done certain dashed re

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Denis Lila
Hi Jim.

> It's interesting to note that non-AA dashed ovals took a
> much bigger hit than AA dashed ovals so we need to see which code is 
> fielding those and see what its issue is.

I think that's because when AA is used the tests take more time, so a
smaller proportion of the time is spent in Dasher; all that is needed
for us to see this pattern is for AA not to have gotten too much worse
(which it hasn't - in fact it's faster).

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> Interesting results.  Note that sun-jdk7 is already showing some 
> regressions so the more interesting results are the old-to-new pisces 
> comparisons.
> 
> I'd stick with the new algorithm and let's look for ways to make dashed
> 
> curves go faster.  I know there are better "curve length" algorithms
> we 
> could incorporate, but let's get a stake in the ground before we 
> consider going back to flattening...
> 
>   ...jim
> 
> On 10/22/2010 2:25 PM, Denis Lila wrote:
> > Hi Jim.
> >
> >> I was going to run these today, but fixing the dashing bug above
> and
> >> rerunning the tests took a while and it's already 8:30 pm here and
> I
> >> have to go home. I'll run them tomorrow morning.
> >
> > I ran the benchmarks. I've attached the options file. I ran
> benchmarks
> > of my icedtea installation, closed source java, openjdk7 without
> the
> > webrev, and openjdk7 with the webrev.
> >
> > The results files are at
> http://icedtea.classpath.org/~dlila/benchResults/
> > I think the names are pretty self explanatory.
> >
> > The html comparisons are:
> > jdk6 vs latest work:
> >
> http://icedtea.classpath.org/~dlila/benchResults/JDK6vsLatest_html/Summary_Report.html
> >
> > closed source vs latest work:
> >
> http://icedtea.classpath.org/~dlila/benchResults/SUNvsLatest_html/Summary_Report.html
> >
> > and most importantly:
> > previous version of pisces in openjdk7 vs latest work:
> >
> http://icedtea.classpath.org/~dlila/benchResults/PrevVsLatest_html/Summary_Report.html
> >
> > The improvements are significant. Running J2DAnalyzer on all the
> results files with
> > jdk6Bench.res as the basis produces the following summary:
> >
> > Summary:
> >jdk6Bench:
> >  Number of tests:  104
> >  Overall average:  30.7986576862
> >  Best spread:  0.0% variance
> >  Worst spread: 10.96% variance
> >  (Basis for results comparison)
> >
> >sunBench:
> >  Number of tests:  104
> >  Overall average:  276654.4443479696
> >  Best spread:  0.25% variance
> >  Worst spread: 19.28% variance
> >  Comparison to basis:
> >Best result:  6488.34% of basis
> >Worst result: 43.74% of basis
> >Number of wins:   80
> >Number of ties:   2
> >Number of losses: 22
> >
> >prevPisces:
> >  Number of tests:  104
> >  Overall average:  221539.3516605794
> >  Best spread:  0.08% variance
> >  Worst spread: 5.54% variance
> >  Comparison to basis:
> >Best result:  350.33% of basis
> >Worst result: 55.0% of basis
> >Number of wins:   57
> >Number of ties:   10
> >Number of losses: 37
> >
> >latestPisces:
> >  Number of tests:  104
> >  Overall average:  226762.64157611743
> >  Best spread:  0.0% variance
> >  Worst spread: 3.03% variance
> >  Comparison to basis:
> >Best result:  501.86% of basis
> >Worst result: 26.23% of basis
> >Number of wins:   72
> >Number of ties:   4
> >Number of losses: 28
> >
> >
> > But unfortunately, if you look at the individual test cases in the
> html
> > reports there are also some stepbacks, most notably in the dashing
> of
> > anything that isn't a straight line. In fact, the results of
> drawOval
> > show a 50%-500% improvement in non dashed drawing, and a 50%-25%
> deterioration
> > in dashed drawing. I was expecting this, after the binary search
> algorithm.
> > I've been thinking it might be better if we just go with the old
> algorithm
> > and simply use Dasher.LengthIterator to flatten the curves and feed
> the lines
> > to lineTo. Or we could just go with what we have and hope we find a
> better
> > algorithm.
> >
> > Regards,
> > Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-22 Thread Jim Graham

On 10/22/2010 12:22 PM, Denis Lila wrote:

Because the error is meant to be relative. What I use is supposed to be
equivalent to difference/AverageOfCorrectValueAndApproximation < err,
where, in our case, AverageOfCorrectValueAndApproximation = (len+leafLen)/2,
so that multiplication by 2 should have been a division.
Should I use the less fancy difference/CorrectValue < err and skip
a division by 2?


If it is relative, then shouldn't it be relative to the desired length? 
 Why does the computed approximation factor in to the size of the error 
you are looking for?  If you use the average then the amount of error 
that you will "accept" will be larger if your estimate is wronger.  I 
don't think the "wrongness" of your approximation should have any effect 
on your error.  How about this:


(Math.abs(len-leftLen) < err*len)

(noting that err*len can be calculated outside of the loop).
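The two termination tests being compared can be made concrete with a small sketch. The following is a hedged, self-contained illustration (invented class and method names, not the Pisces source): the control-polygon length is an upper bound on the true arc length and the chord is a lower bound, so subdivision can stop when poly - chord <= err * poly (poly >= chord by the triangle inequality), which is the shape of the relative test suggested above.

```java
// Illustrative sketch (not the Pisces source): estimate the arc length
// of a cubic Bezier by recursive subdivision, terminating on a relative
// error test against the computed length.
public final class CubicLength {
    // c = {x0,y0, x1,y1, x2,y2, x3,y3}
    static double length(double[] c, double err) {
        double chord = dist(c[0], c[1], c[6], c[7]);
        double poly = dist(c[0], c[1], c[2], c[3])
                    + dist(c[2], c[3], c[4], c[5])
                    + dist(c[4], c[5], c[6], c[7]);
        // poly >= chord always, so no abs() is needed here.
        if (poly - chord <= err * poly) {
            return (poly + chord) / 2; // bounds agree: average them
        }
        double[] l = new double[8], r = new double[8];
        subdivide(c, l, r);
        return length(l, err) + length(r, err);
    }

    // de Casteljau split at t = 0.5
    static void subdivide(double[] c, double[] l, double[] r) {
        double mx01 = (c[0] + c[2]) / 2, my01 = (c[1] + c[3]) / 2;
        double mx12 = (c[2] + c[4]) / 2, my12 = (c[3] + c[5]) / 2;
        double mx23 = (c[4] + c[6]) / 2, my23 = (c[5] + c[7]) / 2;
        double ax = (mx01 + mx12) / 2, ay = (my01 + my12) / 2;
        double bx = (mx12 + mx23) / 2, by = (my12 + my23) / 2;
        double mx = (ax + bx) / 2, my = (ay + by) / 2;
        l[0] = c[0]; l[1] = c[1]; l[2] = mx01; l[3] = my01;
        l[4] = ax;   l[5] = ay;   l[6] = mx;   l[7] = my;
        r[0] = mx;   r[1] = my;   r[2] = bx;   r[3] = by;
        r[4] = mx23; r[5] = my23; r[6] = c[6]; r[7] = c[7];
    }

    static double dist(double x0, double y0, double x1, double y1) {
        double dx = x1 - x0, dy = y1 - y0;
        return Math.sqrt(dx * dx + dy * dy);
    }
}
```

With this form, the tolerance is relative to the computed length rather than to an average of two approximations, and no extra factor of 2 appears in the test.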


lines 98-101 - somewhere in the confusion moveToImpl became redundant
with moveTo.


I thought I'd leave these in because they're interface methods,
and it's usually not a great idea to call these from private methods. I've
removed them anyway, because you said so and because I suppose
if ever we need to do something to the user input that shouldn't be done
to input coming from private methods in the class (such as the scaling
of the input coordinates done in Renderer) we can always easily
reintroduce them.


I actually thought about the interface concept after I sent that and was 
at peace with them, but I'm also at peace with them being gone. ;-)



That's right. I don't think it's what should happen, but it's what closed
jre does, so I've left it in (with some changes to make it actually
replicate the behaviour of closed java, since it was buggy - the moveTo
was missing). I don't draw anything on the second close in the close,close
case, since that would look horrible with round joins and square caps.
However, the way path2D's are implemented this safeguard will not be
activated since a closePath() following another closePath() is ignored.


Note that a custom shape can send segments in any order that it wants so 
close,close can happen from a custom shape even if Path2D won't do it.



I also now initialize the *dxy *mxy variables in moveTo. This should be an
inconsequential change, but I've put it in just so the state of
Stroker after a moveTo is well defined.


New code looks good (even if we think it's a silly side effect of closed 
JDK to have to implement).



line 368 - does this cause a problem if t==1 because lastSegLen will
now return the wrong value?  Maybe move this into an else clause on the
"t>=1" test?


It does cause a problem. I've fixed it by adding a variable that keeps track
of the length of the last returned segment. The way it was being done was
a dirty hack because it assumed that if the method didn't return in the loop,
done would remain false.


Woohoo!  I was actually bothered by the side-effecting in the old code - 
this is a better approach.



I made one more change in dasher: in somethingTo I removed the long comment
near the end, and I handle the case where phase == dash[idx] immediately.
I do this for consistency with lineTo. The only instances where this makes
a difference is when we have a path that starts with a dash, ends at exactly
the end of a dash, and has a closePath. So something like move(100,0),
line(0,0),line(0,50),line(100,50),line(100,0),close() with a dash pattern 
{60,60}.
In closed jdk (and also in openjdk with my patch), no join would be drawn
at 100,0 on this path with this dash pattern.
Whether that's correct behaviour is arguable, imho. On one hand, if we don't do
it, but I think it is because if it isn't done certain dashed rectangles
start looking weird at the top left.


I think it probably should not have a join since we don't join two 
segments if one of the "OFF" lengths is 0.  We should probably only save 
the first dash if it starts in the middle of a dash.


Perhaps the closed JDK is wrong here.


Now, consider a path like
move(100,0),line(0,0),curve(0,100,100,100,100,0),close()
with a dash pattern of {60,60}. The length of the curve in this is exactly 200
(I have not proven this, but it seems clear since the more I increase precision,
the closer to 200 the computed length becomes. Also, if you render it with a
dash pattern of {10,10} in closed jdk, exactly 10 full dashes are drawn). So,
this path has exactly the same length as the above path. However, if this path
is rendered with closed jdk a join will be drawn at (100,0). This behaviour is
inconsistent with the line-only path. For this reason, I think this is a bug
in closed jdk since dashing should depend only on the path's length, not on
the kind of segments in a path.


Note that curve length is imperfect so their behavior may differ because 
the above path had exact lengths that didn't rely on fp calculations 
with errors to get it right, but the curved case had a lot of 
computations.  Even
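The length-200 claim above is easy to check numerically. The sketch below (illustrative only, not part of the patch; the class name is invented) flattens the cubic with control points P0=(0,0), P1=(0,100), P2=(100,100), P3=(100,0) into many uniform-in-t chords and sums them. Analytically, the speed of this curve works out to |B'(t)| = 300(1 - 2t(1-t)), whose integral over [0,1] is exactly 200, so the chord sum should approach 200 as the sample count grows.

```java
// Numeric check of the "length is exactly 200" claim for the cubic
// (0,0) - (0,100) - (100,100) - (100,0).
public final class CurveLengthCheck {
    // Sum the lengths of n chords sampled uniformly in t over [0,1].
    static double length(int n) {
        double len = 0, px = 0, py = 0; // previous sample, starting at B(0)=(0,0)
        for (int i = 1; i <= n; i++) {
            double t = (double) i / n, s = 1 - t;
            // Cubic Bezier components for the control points above
            // (the P0 and P1-x terms vanish because those coordinates are 0).
            double x = 3 * s * t * t * 100 + t * t * t * 100;
            double y = 3 * s * s * t * 100 + 3 * s * t * t * 100;
            len += Math.hypot(x - px, y - py);
            px = x;
            py = y;
        }
        return len;
    }

    public static void main(String[] args) {
        System.out.println(length(200_000)); // approaches 200 as n grows
    }
}
```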

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-22 Thread Jim Graham

Hi Denis,

Interesting results.  Note that sun-jdk7 is already showing some 
regressions so the more interesting results are the old-to-new pisces 
comparisons.  It's interesting to note that non-AA dashed ovals took a 
much bigger hit than AA dashed ovals so we need to see which code is 
fielding those and see what its issue is.


I'd stick with the new algorithm and let's look for ways to make dashed 
curves go faster.  I know there are better "curve length" algorithms we 
could incorporate, but let's get a stake in the ground before we 
consider going back to flattening...


...jim

On 10/22/2010 2:25 PM, Denis Lila wrote:

Hi Jim.


I was going to run these today, but fixing the dashing bug above and
rerunning the tests took a while and it's already 8:30 pm here and I
have to go home. I'll run them tomorrow morning.


I ran the benchmarks. I've attached the options file. I ran benchmarks
of my icedtea installation, closed source java, openjdk7 without the
webrev, and openjdk7 with the webrev.

The results files are at http://icedtea.classpath.org/~dlila/benchResults/
I think the names are pretty self explanatory.

The html comparisons are:
jdk6 vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/JDK6vsLatest_html/Summary_Report.html

closed source vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/SUNvsLatest_html/Summary_Report.html

and most importantly:
previous version of pisces in openjdk7 vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/PrevVsLatest_html/Summary_Report.html

The improvements are significant. Running J2DAnalyzer on all the results
files with jdk6Bench.res as the basis produces the following summary:

Summary:
   jdk6Bench:
 Number of tests:  104
 Overall average:  30.7986576862
 Best spread:  0.0% variance
 Worst spread: 10.96% variance
 (Basis for results comparison)

   sunBench:
 Number of tests:  104
 Overall average:  276654.4443479696
 Best spread:  0.25% variance
 Worst spread: 19.28% variance
 Comparison to basis:
   Best result:  6488.34% of basis
   Worst result: 43.74% of basis
   Number of wins:   80
   Number of ties:   2
   Number of losses: 22

   prevPisces:
 Number of tests:  104
 Overall average:  221539.3516605794
 Best spread:  0.08% variance
 Worst spread: 5.54% variance
 Comparison to basis:
   Best result:  350.33% of basis
   Worst result: 55.0% of basis
   Number of wins:   57
   Number of ties:   10
   Number of losses: 37

   latestPisces:
 Number of tests:  104
 Overall average:  226762.64157611743
 Best spread:  0.0% variance
 Worst spread: 3.03% variance
 Comparison to basis:
   Best result:  501.86% of basis
   Worst result: 26.23% of basis
   Number of wins:   72
   Number of ties:   4
   Number of losses: 28


But unfortunately, if you look at the individual test cases in the html
reports there are also some stepbacks, most notably in the dashing of
anything that isn't a straight line. In fact, the results of drawOval
show a 50%-500% improvement in non dashed drawing, and a 50%-25% deterioration
in dashed drawing. I was expecting this, after the binary search algorithm.
I've been thinking it might be better if we just go with the old algorithm
and simply use Dasher.LengthIterator to flatten the curves and feed the lines
to lineTo. Or we could just go with what we have and hope we find a better
algorithm.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-22 Thread Denis Lila
Hi Jim.

> I was going to run these today, but fixing the dashing bug above and
> rerunning the tests took a while and it's already 8:30 pm here and I
> have to go home. I'll run them tomorrow morning.

I ran the benchmarks. I've attached the options file. I ran benchmarks
of my icedtea installation, closed source java, openjdk7 without the
webrev, and openjdk7 with the webrev.

The results files are at http://icedtea.classpath.org/~dlila/benchResults/
I think the names are pretty self explanatory.

The html comparisons are:
jdk6 vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/JDK6vsLatest_html/Summary_Report.html

closed source vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/SUNvsLatest_html/Summary_Report.html

and most importantly:
previous version of pisces in openjdk7 vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/PrevVsLatest_html/Summary_Report.html

The improvements are significant. Running J2DAnalyzer on all the results
files with jdk6Bench.res as the basis produces the following summary:

Summary:
  jdk6Bench: 
Number of tests:  104
Overall average:  30.7986576862
Best spread:  0.0% variance
Worst spread: 10.96% variance
(Basis for results comparison)

  sunBench: 
Number of tests:  104
Overall average:  276654.4443479696
Best spread:  0.25% variance
Worst spread: 19.28% variance
Comparison to basis:
  Best result:  6488.34% of basis
  Worst result: 43.74% of basis
  Number of wins:   80
  Number of ties:   2
  Number of losses: 22

  prevPisces: 
Number of tests:  104
Overall average:  221539.3516605794
Best spread:  0.08% variance
Worst spread: 5.54% variance
Comparison to basis:
  Best result:  350.33% of basis
  Worst result: 55.0% of basis
  Number of wins:   57
  Number of ties:   10
  Number of losses: 37

  latestPisces: 
Number of tests:  104
Overall average:  226762.64157611743
Best spread:  0.0% variance
Worst spread: 3.03% variance
Comparison to basis:
  Best result:  501.86% of basis
  Worst result: 26.23% of basis
  Number of wins:   72
  Number of ties:   4
  Number of losses: 28


But unfortunately, if you look at the individual test cases in the html
reports there are also some stepbacks, most notably in the dashing of
anything that isn't a straight line. In fact, the results of drawOval
show a 50%-500% improvement in non dashed drawing, and a 50%-25% deterioration
in dashed drawing. I was expecting this, after the binary search algorithm.
I've been thinking it might be better if we just go with the old algorithm
and simply use Dasher.LengthIterator to flatten the curves and feed the lines
to lineTo. Or we could just go with what we have and hope we find a better
algorithm.

Regards,
Denis.


piscesBench.opt
Description: Binary data


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-22 Thread Denis Lila
Hi Jim.

>>>  http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/

> lines 279 and 303 - maybe add Helpers.nearZero(v, nulp) for these?

I've introduced this method and am using it in 303. However, I think
line 279 is good as it is, since we're comparing Float.MIN_VALUE and a double.


> line 303 - could use either within or nearZero, same comment about the
> value being computed twice.  Also, aa+cc gets computed later, not sure
> if the compiler is smart enough to deal with that.

I think caching aa+cc would be a bit overkill. However, I've declared the
variables as final and hopefully that will help the compiler to not
recompute them.

> line 405 - I didn't understand why you use "2*err*(len+leftLen)" as
> the termination condition, why not just err?

Because the error is meant to be relative. What I use is supposed to be
equivalent to difference/AverageOfCorrectValueAndApproximation < err,
where, in our case, AverageOfCorrectValueAndApproximation = (len+leafLen)/2,
so that multiplication by 2 should have been a division.
Should I use the less fancy difference/CorrectValue < err and skip
a division by 2?

> lines 200,236 - for the most part, aren't these redundant?

I think so. When I first implemented the bezier round joins they were
necessary, but I'm not sure why. All I know is when I tested it without
them, joins weren't drawn properly. However, I can't reproduce the problem
now, so I've removed them.

> lines 98-101 - somewhere in the confusion moveToImpl became redundant
> with moveTo.

I thought I'd leave these in because they're interface methods,
and it's usually not a great idea to call these from private methods. I've
removed them anyway, because you said so and because I suppose
if ever we need to do something to the user input that shouldn't be done
to input coming from private methods in the class (such as the scaling
of the input coordinates done in Renderer) we can always easily
reintroduce them.

> lines 358-361 - lineToImpl redundant now?
> line 390 - should finish() really be done here?  That would mean that
> moveTo,close would try to emit something and close,close would as
> well. 
>   Is that right?

That's right. I don't think it's what should happen, but it's what closed
jre does, so I've left it in (with some changes to make it actually
replicate the behaviour of closed java, since it was buggy - the moveTo
was missing). I don't draw anything on the second close in the close,close
case, since that would look horrible with round joins and square caps.
However, the way path2D's are implemented this safeguard will not be
activated since a closePath() following another closePath() is ignored.

I also now initialize the *dxy *mxy variables in moveTo. This should be an
inconsequential change, but I've put it in just so the state of
Stroker after a moveTo is well defined.

> line 368 - does this cause a problem if t==1 because lastSegLen will
> now return the wrong value?  Maybe move this into an else clause on the 
> "t>=1" test?

It does cause a problem. I've fixed it by adding a variable that keeps track
of the length of the last returned segment. The way it was being done was
a dirty hack because it assumed that if the method didn't return in the loop,
done would remain false.

I made one more change in dasher: in somethingTo I removed the long comment
near the end, and I handle the case where phase == dash[idx] immediately.
I do this for consistency with lineTo. The only instances where this makes
a difference is when we have a path that starts with a dash, ends at exactly
the end of a dash, and has a closePath. So something like move(100,0),
line(0,0),line(0,50),line(100,50),line(100,0),close() with a dash pattern 
{60,60}.
In closed jdk (and also in openjdk with my patch), no join would be drawn
at 100,0 on this path with this dash pattern.
Whether that's correct behaviour is arguable, imho. On one hand, if we don't do
it, but I think it is because if it isn't done certain dashed rectangles
start looking weird at the top left.

Now, consider a path like
move(100,0),line(0,0),curve(0,100,100,100,100,0),close()
with a dash pattern of {60,60}. The length of the curve in this is exactly 200
(I have not proven this, but it seems clear since the more I increase precision,
the closer to 200 the computed length becomes. Also, if you render it with a
dash pattern of {10,10} in closed jdk, exactly 10 full dashes are drawn). So,
this path has exactly the same length as the above path. However, if this path
is rendered with closed jdk a join will be drawn at (100,0). This behaviour is
inconsistent with the line-only path. For this reason, I think this is a bug
in closed jdk since dashing should depend only on the path's length, not on
the kind of segments in a path.

Handling the case where phase==dash[idx] immediately fixes this problem.
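The fix amounts to the following bookkeeping (a simplified, hypothetical sketch - the real Dasher also tracks coordinates and emits the dash segments): when an advance consumes the current dash exactly, the index flips immediately instead of leaving phase == dash[idx] pending for the next call, so a segment that ends precisely on a dash boundary leaves the dasher in the same state as one that ends just past it.

```java
// Simplified dash-phase bookkeeping (illustrative; not the actual Dasher).
public final class DashState {
    static int idx;        // index into the dash array
    static double phase;   // distance already consumed within dash[idx]
    static boolean dashOn; // are we currently inside an "on" dash?

    // Consume 'len' units of path length against the dash pattern.
    static void advance(double len, double[] dash) {
        while (len > 0) {
            double left = dash[idx] - phase;
            if (len < left) {
                phase += len;
                return;
            }
            // The current dash is consumed exactly (or overrun): flip to
            // the next dash immediately rather than leaving
            // phase == dash[idx] for the next call.
            len -= left;
            idx = (idx + 1) % dash.length;
            dashOn = !dashOn;
            phase = 0;
        }
    }
}
```

With this state machine, a path of length 300 and pattern {60,60} ends a dash exactly at the closing point whether the last segment is a line or a curve, which is the consistency Denis is arguing for.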

I also moved the second if statement in the loop of lineTo inside the first if
for symmetry with how this case is handled in somethingTo.

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-22 Thread Denis Lila
Hi Jim.

> Does closed JDK emit something on moveto,close?  (Or close,close?)

Surprisingly, it does. On moveto,close it draws caps just as if a 0
length line had been drawn. Isn't this a bug? It doesn't make much sense
to me that something should be drawn because of a moveTo.

So, if we want to replicate closed jdk's behaviour, the finish() call there
is correct, but it alone isn't enough. An emitMoveTo needs to be issued,
otherwise the last point of the last drawing command will be connected
to the cap drawn at the last moveTo position.

Regards,
Denis.

- "Jim Graham"  wrote:

> This comment may make more sense if I explain that the condition for 
> when finish() is executed in closePath is the reverse of the condition
> 
> in move and done.  I agree that it should return if prev!=OP_TO, but
> I'm 
> not so sure about doing a finish().
> 
>   ...jim
> 
> On 10/21/2010 6:10 PM, Jim Graham wrote:
> > Hi Denis,
> >
> > I saw something in the latest webrev that reminded me of an earlier
> > comment.
> >
> > On 10/18/2010 2:21 PM, Denis Lila wrote:
> >>> line 389 - The test here is different from closePath. What if
> they
> >>> were both "prev == DRAWING_OP_TO"?
> >>
> >> I am now using prev!=DRAWING_OP_TO (not ==, since it is supposed
> to
> >> execute
> >> finish() if no nonzero length lines have been fed to Stroker yet).
> In
> >> fact
> >> I have removed the "started" variable since it's equivalent to
> >> prev==DRAWING_OP_TO.
> >
> > It looks like closePath still uses a different test than moveTo and
> > pathDone. They all test for DRAWING_OP_TO, but closepath uses !=
> whereas
> > the others use ==. Is that right?
> >
> > ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Jim Graham

Hi Denis,

This should be the last batch of comments, most of which may require a 
10 second answer.  Most of them could just be ignored as they are minor 
optimizations.  There are only a couple where I think something is "off"...


PiscesRenderingEngine.java:

lines 279 and 303 - maybe add Helpers.nearZero(v, nulp) for these?  For 
one thing, you compute the same values twice - hopefully the compiler 
will notice that.


line 303 - could use either within or nearZero, same comment about the 
value being computed twice.  Also, aa+cc gets computed later, not sure 
if the compiler is smart enough to deal with that.


line 325 - you don't need to clone at.  "outat = at" should be enough.

A note for next revision - you don't need to include the translation in 
outat and inat, it would be enough to use just the abcd part of the 
transform.  Right now, the TransformingPathConsumer implementations tend 
to assume translation, but we could create variants for the 
non-translating cases and maybe save some work.


Dasher.java:

lines 98-101 - somewhere in the confusion moveToImpl became redundant 
with moveTo.


lines 174-175 - lineToImpl now redundant too.

line 368 - does this cause a problem if t==1 because lastSegLen will now 
return the wrong value?  Maybe move this into an else clause on the 
"t>=1" test?


line 405 - I didn't understand why you use "2*err*(len+leftLen)" as the 
termination condition, why not just err?


Stroker.java:

lines 200,236 - for the most part, aren't these redundant?

lines 358-361 - lineToImpl redundant now?

line 390 - should finish() really be done here?  That would mean that 
moveTo,close would try to emit something and close,close would as well. 
 Is that right?


line 800 - technically, this could be [0] and [1] since all points on 
the curve are the same.  ;-)


line 805,810 - abs()?

line 821,824 - you add 1 to nCurves just to subtract 1 later?  Maybe 
rename it to nSplits and skip the +/-1?


...jim

On 10/21/2010 1:52 PM, Jim Graham wrote:

Hi Denis,

I'll be focusing on this later today, just a last proofread.

In the meantime, can you outline the tests that you ran?

Also, have you used J2DBench at all? I know you ran some of your own
benchmarks, but didn't know if you were familiar with this tool:

{OpenJDK}/src/share/demo/java2d/J2DBench/

...jim

On 10/21/2010 12:27 PM, Denis Lila wrote:

Good to push?

http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Jim Graham
This comment may make more sense if I explain that the condition for 
when finish() is executed in closePath is the reverse of the condition 
in move and done.  I agree that it should return if prev!=OP_TO, but I'm 
not so sure about doing a finish().  Does closed JDK emit something on 
moveto,close?  (Or close,close?)


...jim

On 10/21/2010 6:10 PM, Jim Graham wrote:

Hi Denis,

I saw something in the latest webrev that reminded me of an earlier
comment.

On 10/18/2010 2:21 PM, Denis Lila wrote:

line 389 - The test here is different from closePath. What if they
were both "prev == DRAWING_OP_TO"?


I am now using prev!=DRAWING_OP_TO (not ==, since it is supposed to
execute
finish() if no nonzero length lines have been fed to Stroker yet). In
fact
I have removed the "started" variable since it's equivalent to
prev==DRAWING_OP_TO.


It looks like closePath still uses a different test than moveTo and
pathDone. They all test for DRAWING_OP_TO, but closepath uses != whereas
the others use ==. Is that right?

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Jim Graham

Hi Denis,

I saw something in the latest webrev that reminded me of an earlier comment.

On 10/18/2010 2:21 PM, Denis Lila wrote:

line 389 - The test here is different from closePath.  What if they
were both "prev == DRAWING_OP_TO"?


I am now using prev!=DRAWING_OP_TO (not ==, since it is supposed to execute
finish() if no nonzero length lines have been fed to Stroker yet). In fact
I have removed the "started" variable since it's equivalent to 
prev==DRAWING_OP_TO.


It looks like closePath still uses a different test than moveTo and 
pathDone.  They all test for DRAWING_OP_TO, but closepath uses != 
whereas the others use ==.  Is that right?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Denis Lila
Hi Jim.

> In the meantime, can you outline the tests that you ran?

I ran Java2D without any problems. There's also been an
icedtea bug http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=450
related to a look and feel (tinylaf:
http://www.muntjak.de/hans/java/tinylaf/tinylaf-1_4_0.zip
to run a demo, download, extract and run tinycp.jar) that wasn't being
painted properly. Now it looks good (I think there's no LCD text antialiasing,
but that's a different story).

I also wrote a test that rendered at least 5 random cubic curves. I didn't
inspect these too closely - this test was just to find any obvious
errors. There weren't any.

Most importantly, I also ran this test suite:
http://icedtea.classpath.org/hg/gfx-test/

It generates a lot of images using a reference java implementation and
an implementation to be tested. I used closed source java as a reference,
and I ran it using a fresh checkout of the 2d branch of openjdk7:
http://icedtea.classpath.org/~dlila/prevPiscesSungfx/results/

and a fresh checkout of the 2d branch with my patch applied:
http://icedtea.classpath.org/~dlila/latestPiscesSungfx/

As you can see, the number of images that are the same as the reference
implementation has increased substantially. It has also increased
consistently across all categories.
I've gone through most of the images, and the ones that are different
aren't dramatically different. The differences are barely noticeable.
The improvement is even greater when compared to openjdk6 which produces
some pretty scary test results, especially in the scaled rectangles
category (but I haven't uploaded these tests so you'll just have to trust
me on this ;-)  )

NOTE: the webrev has changed slightly since the last e-mail I sent. The 
gfx test suite revealed a bug in the drawing of dashed rectangles, so
to fix it I have changed a line in Dasher.closePath from "if(needsMoveTo) {"
to "if(needsMoveTo || !dashOn) {"

> Also, have you used J2DBench at all?  I know you ran some of your own
> benchmarks, but didn't know if you were familiar with this tool:
> 
>   {OpenJDK}/src/share/demo/java2d/J2DBench/

I was going to run these today, but fixing the dashing bug above and
rerunning the tests took a while and it's already 8:30 pm here and I have to
go home. I'll run them tomorrow morning.

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> I'll be focusing on this later today, just a last proofread.
> 

> 
>   ...jim
> 
> On 10/21/2010 12:27 PM, Denis Lila wrote:
> > Good to push?
> >
> > http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Jim Graham

Hi Denis,

I'll be focusing on this later today, just a last proofread.

In the meantime, can you outline the tests that you ran?

Also, have you used J2DBench at all?  I know you ran some of your own 
benchmarks, but didn't know if you were familiar with this tool:


{OpenJDK}/src/share/demo/java2d/J2DBench/

...jim

On 10/21/2010 12:27 PM, Denis Lila wrote:

Good to push?

http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Denis Lila
Hi Jim.

> When one talks about curves and being parallel, my mind 
> tends to think of the tangents of the curves being parallel and
> tangents are directed by the first derivative.

That's what I was trying to express. The tangents of curves A and B
at t are parallel if and only if there's some c such that A'(t) = c*B'(t),
which is what I defined || to mean, although perhaps I didn't make this
clear enough.
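The distinction can be checked numerically with the two-lines example from this thread (a toy sketch, not Pisces code): for A(t) = (2t, 0) and B(t) = (2t, 1), the position vectors at t = 0.5 are not parallel, but the derivative (tangent) vectors are. A 2D cross product is zero exactly when two vectors are parallel.

```java
// Toy check of "A'(t) || B'(t) vs. A(t) || B(t)" using the lines from
// the e-mail: A from (0,0) to (2,0), B from (0,1) to (2,1).
public class ParallelTangents {
    // 2D cross product; zero iff the two vectors are parallel (collinear).
    static double cross(double ax, double ay, double bx, double by) {
        return ax * by - ay * bx;
    }

    public static void main(String[] args) {
        double t = 0.5;
        double[] A  = {2 * t, 0}, B  = {2 * t, 1};  // points: (1,0) and (1,1)
        double[] dA = {2, 0},     dB = {2, 0};      // derivatives (tangents)
        System.out.println(cross(A[0], A[1], B[0], B[1]) == 0);     // false
        System.out.println(cross(dA[0], dA[1], dB[0], dB[1]) == 0); // true
    }
}
```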

> Also, if you are going to use your definition of "vector" then
> parallel is an odd term to use 

I agree. I was never comfortable with using "parallel" but I couldn't
find a better word. How about "aligned"?

Anyway, I've changed the comments to something that hopefully will
avoid this sort of confusion in the future.

Other than this, I think I've followed all your other suggestions
(putting some of the degenerate curve handling in separate functions,
fixing the style issues, removing unused methods, etc).
I also fixed a bug in Renderer.edgeSetCurY where I was using
edges[idx+MINY] instead of edges[idx+CURY] (which produced visible
artifacts on nearly horizontal lines). I am also now using your
TransformingPathConsumer2D class, which simplified things quite a bit
wherever it was used.

Good to push?

http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/

Regards,
Denis.

- "Jim Graham"  wrote:

> OK, I can see how your terminology works now, but it seems odd to me. 
> I 
> never consider re-expressing the coordinates on a curve as a vector
> and 
> basing geometric properties on those constructed vectors.  I either 
> consider the points on the curve, or its tangent or its normal - none
> of 
> which is the value you are expressing.  You are, essentially,
> operating 
> on tangent vectors, but you aren't calling them that, you are calling
> 
> them something like "the vector of the derivative" which is a relative
> 
> (direction only) version of the tangent vector (which has location and
> 
> direction).  When one talks about curves and being parallel, my mind 
> tends to think of the tangents of the curves being parallel and 
> tangents are directed by the first derivative.
> 
> Also, if you are going to use your definition of "vector" then 
> parallel is an odd term to use for values that originate from the same point 
> (points considered as a vector are taken to originate from 0,0) -
> really 
> you want those "vectors" to be collinear, not (just) parallel.
> 
> So, either || means "the coordinates of the curves expressed as
> vectors 
> are collinear" or it means "the curves (i.e. the tangents of the curve
> 
> at the indicated point) are parallel".  Saying "vector I() is parallel
> 
> to vector B()" didn't really have meaning to me based on the above
> biases.
> 
> So, I get your comment now and all of the math makes sense, but the 
> terminology seemed foreign to me...
> 
>   ...jim
> 
> On 10/20/10 10:48 AM, Denis Lila wrote:
> >> Also, how is A(t) and B(t) are parallel not the same as "the curves
> A
> >> and B are parallel at t"?
> >
> > Well, suppose A and B are lines with endpoints (0,0), (2,0) for A
> > and (0,1),(2,1) for B. Obviously, for all t, A and B are parallel at
> t.
> > However let t = 0.5. Then A(t) = (1,0) and B(t) = (1, 1). The
> vectors
> > (1,0) and (1,1) are not parallel, so saying A(t) || B(t) is the
> same
> > as saying that there exists c such that (1,0) = c*(1,1), which isn't
> true.
> >
> > However, A'(t)=(2,0) and B'(t)=(2,0), and the vectors
> > (2,0) and (2,0) are parallel.
> >
> > Does this make more sense?
> >
> > Regards,
> > Denis.
> >
> > - "Jim Graham"  wrote:
> >
> >> On 10/20/10 7:54 AM, Denis Lila wrote:
>  In #2, you have a bunch of "I'() || B'()" which I read as "the
> >> slope
>  of the derivative (i.e. acceleration) is equal", don't you
> really
> >> mean
>  "I() || B()" which would mean the original curves should be
> >> parallel?
>  Otherwise you could say "I'() == B'()", but I think you want to
> >> show
>  parallels because that shows how you can use the dxy1,dxy4
> values
> >> as
>  the parallel equivalents.
> >>>
> >>> Not really. I've updated the comment explaining what || does, and
> >>> it should be clearer now. Basically, A(t) || B(t) means that
> >> vectors
> >> A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero
> >> c),
> >>> not that curves A and B are parallel at t.
> >>
> >> I'm not sure we are on the same page here.
> >>
> >> I'() is usually the symbol indicating the "derivative" of I().  My
> >> issue
> >> is not with the || operator, but with the fact that you are
> applying
> >> it
> >> to the I'() instead of I().
> >>
> >> Also, A(t) = c*B(t) is always true for all A and B and all t if
> you
> >> take
> >> a sample in isolation.  Parallel means something like "A(t) =
> c*B(t)
> >> with the same value of c for some interval around t", not that the
> >> values at t can be expressed as a multiple.
> >>
> >> Again, I'() || B'() says to me that the derivative curves are
> >> parallel,
> >> not that the original curves are parallel...
> >>
> >>...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Denis Lila
Hello Jim.

> So either it is always on the right side, always on the wrong side
> (i.e. just reverse the rotation in the math), or always on the right/wrong 
> side depending on the CWness of the join angle - which would be 
> reflected in rev...  No?

That's a good point. I've changed the test to simply if(rev) because
the normal we compute is on the left of the original vector
(omx,omy)-(mx,my). So, if the arc was counterclockwise, no adjustment
was needed.
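The side-of-chord test being discussed can be sketched with a 2D cross product (the variable names follow the e-mails; the method itself is an illustration, not the actual Pisces code): the sign of the cross product of the chord direction (omx,omy)->(mx,my) with the vector to (mmx,mmy) tells which side of the chord the computed point lies on.

```java
// Sketch of a side-of-chord (clockwise-ness) test like the one discussed.
// Variable names follow the e-mails; this is not the Pisces Stroker code.
public class SideOfChord {
    // true if (mmx,mmy) lies clockwise of (to the right of) the
    // directed chord (omx,omy) -> (mx,my)
    static boolean isCW(double omx, double omy, double mx, double my,
                        double mmx, double mmy) {
        return (mx - omx) * (mmy - omy) - (my - omy) * (mmx - omx) < 0;
    }

    public static void main(String[] args) {
        // chord along +x from (0,0) to (1,0): a point below it is clockwise
        System.out.println(isCW(0, 0, 1, 0, 0.5, -1)); // true
        System.out.println(isCW(0, 0, 1, 0, 0.5,  1)); // false
    }
}
```

Because the computed normal is always on a known side of the chord, the sign of this test is predictable from the winding of the join, which is why the check reduces to if (rev).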

Regards,
Denis.

- "Jim Graham"  wrote:

> Right, but it seemed to me that if omxy was the "from" vector and mxy
> 
> was the "to" vector, that the computed mmxy should always be
> predictably 
> on the same side of it, no?  If it was on the wrong side then it 
> wouldn't be a random occurrence, it must be related to the input data.
> 
>   ...jim
> 
> On 10/20/10 10:29 AM, Denis Lila wrote:
> >> Cool, but above I was also asking the same question about line
> 231,
> >> and you provided a lot of information about line 231 (and a test to
> verify
> >> it), but didn't answer if the test in line 231 also tracks rev the
> >> same way...?
> >
> > Oh, no, line 231 isn't meant to be related to rev at all. It just
> checks
> > to see on which side of the (omx,omy),(mx,my) chord the computed
> (mmx, mmy)
> > is.
> >
> > Regards,
> > Denis.
> >
> > - "Jim Graham"  wrote:
> >
> >> Hi Denis,
> >>
> >> One clarification:
> >>
> >> On 10/20/10 7:11 AM, Denis Lila wrote:
>  When would the isCW test trigger?  Does it track "rev"?  What
> >> happens
>  at 180 degrees (is that test reliable for the randomization that
> >> might
>  happen when omxy are directly opposite mxy)?
> >>>
> >>> isCw is used for computing the arc bisector by testing whether
> the
> >> computed
> >>> point is on the side it should be (and multiplying by -1 if not),
> it
> >> is used
> >>> to compute the sign of cv in drawBezApproxForArc, and for
> computing
> >> rev.
> >>>
>  The only reason I ask is
>  because I think the sign of mmxy is probably controllable by
>  understanding the input conditions, but this test should be safe
>  (modulo if it really works at 180 degrees).  If it has failure
> >> modes at 180
>  degrees then reworking the math to produce the right sign in the
> >> first
>  place may be more robust for that case.  A test for this is to
> >> render
>  "(0,0) ->   (100,0) ->   (0,0)" with round caps and then rotate
> it
> >> through
>  360 degrees and see if the round caps invert at various angles.
> >>>
> >>> I already did that. I drew 100 lines like the one you describe. I
> >> attached
> >>> the results. It never fails. It is still possible that there
> could
> >> be some
> >>> case where it fails, but this does prove that such a case would
> be
> >> very rare.
> >>>
>  Also, line 256 - does that track "rev"?
> >>>
> >>> It does. I changed the test to if(rev).
> >>
> >>...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham
OK, I can see how your terminology works now, but it seems odd to me.  I 
never consider re-expressing the coordinates on a curve as a vector and 
basing geometric properties on those constructed vectors.  I either 
consider the points on the curve, or its tangent or its normal - none of 
which is the value you are expressing.  You are, essentially, operating 
on tangent vectors, but you aren't calling them that, you are calling 
them something like "the vector of the derivative" which is a relative 
(direction only) version of the tangent vector (which has location and 
direction).  When one talks about curves and being parallel, my mind 
tends to think of the tangents of the curves being parallel and tangents 
are directed by the first derivative.


Also, if you are going to use your definition of "vector" then parallel 
is an odd term to use for values that originate from the same point 
(points considered as a vector are taken to originate from 0,0) - really 
you want those "vectors" to be collinear, not (just) parallel.


So, either || means "the coordinates of the curves expressed as vectors 
are collinear" or it means "the curves (i.e. the tangents of the curve 
at the indicated point) are parallel".  Saying "vector I() is parallel 
to vector B()" didn't really have meaning to me based on the above biases.


So, I get your comment now and all of the math makes sense, but the 
terminology seemed foreign to me...


...jim

On 10/20/10 10:48 AM, Denis Lila wrote:

Also, how is A(t) and B(t) are parallel not the same as "the curves A
and B are parallel at t"?


Well, suppose A and B are lines with endpoints (0,0), (2,0) for A
and (0,1),(2,1) for B. Obviously, for all t, A and B are parallel at t.
However let t = 0.5. Then A(t) = (1,0) and B(t) = (1, 1). The vectors
(1,0) and (1,1) are not parallel, so saying A(t) || B(t) is the same
as saying that there exists c such that (1,0) = c*(1,1), which isn't true.

However, A'(t)=(2,0) and B'(t)=(2,0), and the vectors
(2,0) and (2,0) are parallel.

Does this make more sense?

Regards,
Denis.

- "Jim Graham"  wrote:


On 10/20/10 7:54 AM, Denis Lila wrote:

In #2, you have a bunch of "I'() || B'()" which I read as "the

slope

of the derivative (i.e. acceleration) is equal", don't you really

mean

"I() || B()" which would mean the original curves should be

parallel?

Otherwise you could say "I'() == B'()", but I think you want to

show

parallels because that shows how you can use the dxy1,dxy4 values

as

the parallel equivalents.


Not really. I've updated the comment explaining what || does, and
it should be clearer now. Basically, A(t) || B(t) means that

vectors

A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero

c),

not that curves A and B are parallel at t.


I'm not sure we are on the same page here.

I'() is usually the symbol indicating the "derivative" of I().  My
issue
is not with the || operator, but with the fact that you are applying
it
to the I'() instead of I().

Also, A(t) = c*B(t) is always true for all A and B and all t if you
take
a sample in isolation.  Parallel means something like "A(t) = c*B(t)
with the same value of c for some interval around t", not that the
values at t can be expressed as a multiple.

Again, I'() || B'() says to me that the derivative curves are
parallel,
not that the original curves are parallel...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham
Right, but it seemed to me that if omxy was the "from" vector and mxy 
was the "to" vector, that the computed mmxy should always be predictably 
on the same side of it, no?  If it was on the wrong side then it 
wouldn't be a random occurrence, it must be related to the input data. 
So either it is always on the right side, always on the wrong side (i.e. 
just reverse the rotation in the math), or always on the right/wrong 
side depending on the CWness of the join angle - which would be 
reflected in rev...  No?


...jim

On 10/20/10 10:29 AM, Denis Lila wrote:

Cool, but above I was also asking the same question about line 231,
and you provided a lot of information about line 231 (and a test to verify
it), but didn't answer if the test in line 231 also tracks rev the
same way...?


Oh, no, line 231 isn't meant to be related to rev at all. It just checks
to see on which side of the (omx,omy),(mx,my) chord the computed (mmx, mmy)
is.

Regards,
Denis.

- "Jim Graham"  wrote:


Hi Denis,

One clarification:

On 10/20/10 7:11 AM, Denis Lila wrote:

When would the isCW test trigger?  Does it track "rev"?  What

happens

at 180 degrees (is that test reliable for the randomization that

might

happen when omxy are directly opposite mxy)?


isCw is used for computing the arc bisector by testing whether the

computed

point is on the side it should be (and multiplying by -1 if not), it

is used

to compute the sign of cv in drawBezApproxForArc, and for computing

rev.



The only reason I ask is
because I think the sign of mmxy is probably controllable by
understanding the input conditions, but this test should be safe
(modulo if it really works at 180 degrees).  If it has failure

modes at 180

degrees then reworking the math to produce the right sign in the

first

place may be more robust for that case.  A test for this is to

render

"(0,0) ->   (100,0) ->   (0,0)" with round caps and then rotate it

through

360 degrees and see if the round caps invert at various angles.


I already did that. I drew 100 lines like the one you describe. I

attached

the results. It never fails. It is still possible that there could

be some

case where it fails, but this does prove that such a case would be

very rare.



Also, line 256 - does that track "rev"?


It does. I changed the test to if(rev).


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Denis Lila
> Also, how is A(t) and B(t) are parallel not the same as "the curves A
> and B are parallel at t"?

Well, suppose A and B are lines with endpoints (0,0), (2,0) for A
and (0,1),(2,1) for B. Obviously, for all t, A and B are parallel at t.
However let t = 0.5. Then A(t) = (1,0) and B(t) = (1, 1). The vectors
(1,0) and (1,1) are not parallel, so saying A(t) || B(t) is the same
as saying that there exists c such that (1,0) = c*(1,1), which isn't true.

However, A'(t)=(2,0) and B'(t)=(2,0), and the vectors 
(2,0) and (2,0) are parallel.

Does this make more sense?

Regards,
Denis.

- "Jim Graham"  wrote:

> On 10/20/10 7:54 AM, Denis Lila wrote:
> >> In #2, you have a bunch of "I'() || B'()" which I read as "the
> slope
> >> of the derivative (i.e. acceleration) is equal", don't you really
> mean
> >> "I() || B()" which would mean the original curves should be
> parallel?
> >> Otherwise you could say "I'() == B'()", but I think you want to
> show
> >> parallels because that shows how you can use the dxy1,dxy4 values
> as
> >> the parallel equivalents.
> >
> > Not really. I've updated the comment explaining what || does, and
> > it should be clearer now. Basically, A(t) || B(t) means that
> vectors
> > A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero
> c),
> > not that curves A and B are parallel at t.
> 
> I'm not sure we are on the same page here.
> 
> I'() is usually the symbol indicating the "derivative" of I().  My
> issue 
> is not with the || operator, but with the fact that you are applying
> it 
> to the I'() instead of I().
> 
> Also, A(t) = c*B(t) is always true for all A and B and all t if you
> take 
> a sample in isolation.  Parallel means something like "A(t) = c*B(t) 
> with the same value of c for some interval around t", not that the 
> values at t can be expressed as a multiple.
> 
> Again, I'() || B'() says to me that the derivative curves are
> parallel, 
> not that the original curves are parallel...
> 
>   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Denis Lila
> Cool, but above I was also asking the same question about line 231,
> and you provided a lot of information about line 231 (and a test to verify
> it), but didn't answer if the test in line 231 also tracks rev the
> same way...?

Oh, no, line 231 isn't meant to be related to rev at all. It just checks
to see on which side of the (omx,omy),(mx,my) chord the computed (mmx, mmy)
is.

Regards,
Denis.

- "Jim Graham"  wrote:

> Hi Denis,
> 
> One clarification:
> 
> On 10/20/10 7:11 AM, Denis Lila wrote:
> >> When would the isCW test trigger?  Does it track "rev"?  What
> happens
> >> at 180 degrees (is that test reliable for the randomization that
> might
> >> happen when omxy are directly opposite mxy)?
> >
> > isCw is used for computing the arc bisector by testing whether the
> computed
> > point is on the side it should be (and multiplying by -1 if not), it
> is used
> > to compute the sign of cv in drawBezApproxForArc, and for computing
> rev.
> >
> >> The only reason I ask is
> >> because I think the sign of mmxy is probably controllable by
> >> understanding the input conditions, but this test should be safe
> >> (modulo if it really works at 180 degrees).  If it has failure
> modes at 180
> >> degrees then reworking the math to produce the right sign in the
> first
> >> place may be more robust for that case.  A test for this is to
> render
> >> "(0,0) ->  (100,0) ->  (0,0)" with round caps and then rotate it
> through
> >> 360 degrees and see if the round caps invert at various angles.
> >
> > I already did that. I drew 100 lines like the one you describe. I
> attached
> > the results. It never fails. It is still possible that there could
> be some
> > case where it fails, but this does prove that such a case would be
> very rare.
> >
> >> Also, line 256 - does that track "rev"?
> >
> > It does. I changed the test to if(rev).
> 
>   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham

On 10/20/10 7:54 AM, Denis Lila wrote:

In #2, you have a bunch of "I'() || B'()" which I read as "the slope
of the derivative (i.e. acceleration) is equal", don't you really mean
"I() || B()" which would mean the original curves should be parallel?
Otherwise you could say "I'() == B'()", but I think you want to show
parallels because that shows how you can use the dxy1,dxy4 values as
the parallel equivalents.


Not really. I've updated the comment explaining what || does, and
it should be clearer now. Basically, A(t) || B(t) means that vectors
A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero c),
not that curves A and B are parallel at t.


I'm not sure we are on the same page here.

I'() is usually the symbol indicating the "derivative" of I().  My issue 
is not with the || operator, but with the fact that you are applying it 
to the I'() instead of I().


Also, how is A(t) and B(t) are parallel not the same as "the curves A 
and B are parallel at t"?


Also, A(t) = c*B(t) is always true for all A and B and all t if you take 
a sample in isolation.  Parallel means something like "A(t) = c*B(t) 
with the same value of c for some interval around t", not that the 
values at t can be expressed as a multiple.


Again, I'() || B'() says to me that the derivative curves are parallel, 
not that the original curves are parallel...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham

Hi Denis,

One clarification:

On 10/20/10 7:11 AM, Denis Lila wrote:

When would the isCW test trigger?  Does it track "rev"?  What happens
at 180 degrees (is that test reliable for the randomization that might
happen when omxy are directly opposite mxy)?


isCw is used for computing the arc bisector by testing whether the computed
point is on the side it should be (and multiplying by -1 if not), it is used
to compute the sign of cv in drawBezApproxForArc, and for computing rev.


The only reason I ask is
because I think the sign of mmxy is probably controllable by
understanding the input conditions, but this test should be safe
(modulo if it really works at 180 degrees).  If it has failure modes at 180
degrees then reworking the math to produce the right sign in the first
place may be more robust for that case.  A test for this is to render
"(0,0) ->  (100,0) ->  (0,0)" with round caps and then rotate it through
360 degrees and see if the round caps invert at various angles.


I already did that. I drew 100 lines like the one you describe. I attached
the results. It never fails. It is still possible that there could be some
case where it fails, but this does prove that such a case would be very rare.


Also, line 256 - does that track "rev"?


It does. I changed the test to if(rev).


Cool, but above I was also asking the same question about line 231, and 
you provided a lot of information about line 231 (and a test to verify 
it), but didn't answer if the test in line 231 also tracks rev the same 
way...?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Denis Lila
Hi Jim.

> I wasn't sure why you isolated that term out there instead of just 
> grouping it with the rest of the numerator - is there a danger of 
> overflow if you multiply it before you do the division?  If so, then 
> that is fine since it doesn't actually affect the number of fp ops so
> it should be the same performance.

I'm not sure if there's a danger of overflow, but the numbers there do
tend to be large, so I wanted to be safe.
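The overflow concern can be demonstrated with a toy float computation (the operand values are hypothetical, not the actual Pisces terms): grouping the multiplication before the division can overflow to Infinity, while isolating the divided term first keeps the intermediate in range, at the same cost in fp operations.

```java
// Toy demonstration of why dividing before multiplying can matter
// with large float operands (hypothetical values, not the real terms).
public class OverflowOrder {
    public static void main(String[] args) {
        float a = 3e30f, b = 4e30f, d = 2e30f;
        float grouped  = a * b / d;  // a*b exceeds Float.MAX_VALUE -> Infinity
        float isolated = a / d * b;  // divide first; stays finite (~6e30)
        System.out.println(Float.isInfinite(grouped)); // true
        System.out.println(Float.isFinite(isolated));  // true
    }
}
```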

> In #2, you have a bunch of "I'() || B'()" which I read as "the slope
> of the derivative (i.e. acceleration) is equal", don't you really mean
> "I() || B()" which would mean the original curves should be parallel? 
> Otherwise you could say "I'() == B'()", but I think you want to show 
> parallels because that shows how you can use the dxy1,dxy4 values as
> the parallel equivalents.

Not really. I've updated the comment explaining what || does, and
it should be clearer now. Basically, A(t) || B(t) means that vectors
A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero c),
not that curves A and B are parallel at t.

> so it works out the same either way.  Fun...

Yeah - if one is consistent with one's definitions and if the algebra
is followed mechanically the signs take care of themselves.

> No, the existing stuff is clean and works fine so let's leave it - for
> now at the very least...

Sure. I just meant that when I have some free time in the future I'll
implement this other idea just to satisfy my curiosity. I doubt it
will work better than what we have though (or work at all).

Regards,
Denis.

