Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-05 Thread Denis Lila
Hi Jim.

 What about logic like this:
 
 boolean checkRoots = false;
 if (D < 0) {
 // 3 solution form is possible, so use it
 checkRoots = (D > -TINY); // Check them if we were borderline
 // compute 3 roots as before
 } else {
 double u = ...;
 double v = ...;
 res[0] = u+v; // should be 2*u if D is near zero
 if (u close to v) { // Will be true for D near zero
 res[1] = -res[0]/2; // should be -u if D is near zero
 checkRoots = true; // Check them if we were borderline
 // Note that q=0 case ends up here as well...
 }
 }
 if (checkRoots) {
 if (num > 2 && (res[2] == res[1] || res[2] == res[0])) {
 num--;
 }
 if (num > 1 && res[1] == res[0]) {
 res[1] = res[--num]; // Copies res[2] to res[1] if needed
 }
 for (int i = num-1; i >= 0; i--) {
 res[i] = refine(res[i]);
 for (int j = i+1; j < num; j++) {
 if (res[i] == res[j]) {
 res[i] = res[--num];
 break;
 }
 }
 }
 }

I have two concerns about this.

1. How would refine() be implemented? Would we do something like in
findZero? (See the sketch below.)
2. While I have no problem with your suggestion, it won't help us in
cases where there's a root with a multiplicity of 2 which is not found.
This was a large part of the motivation for this work. What should we
do in this case? The only solution I can see is to find the roots of
the derivative and evaluate the cubic at them, because when a root
has multiplicity > 1, the polynomial's derivative also has a root
there. This is what I'm currently doing.
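
For concreteness, here is the kind of refine() I had in mind - only a
sketch, not tested: it needs the coefficients too (so the signature
differs from your pseudocode), and the 3-step cap is arbitrary:

// A possible refine(): a few Newton steps on the original cubic,
// keeping whichever iterate has the smallest residual. The eqn layout
// is eqn[0] + eqn[1]*x + eqn[2]*x^2 + eqn[3]*x^3, as in
// CubicCurve2D.solveCubic.
static double refine(double[] eqn, double t) {
    double best = t;
    double bestErr = Math.abs(evalCubic(eqn, t));
    for (int i = 0; i < 3; i++) {
        double slope = eqn[1] + t * (2 * eqn[2] + t * 3 * eqn[3]);
        if (slope == 0.0) {
            break; // Newton step undefined; keep what we have
        }
        t -= evalCubic(eqn, t) / slope;
        double err = Math.abs(evalCubic(eqn, t));
        if (err < bestErr) {
            best = t;
            bestErr = err;
        }
    }
    return best;
}

static double evalCubic(double[] eqn, double t) {
    return eqn[0] + t * (eqn[1] + t * (eqn[2] + t * eqn[3]));
}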

Should we go ahead and push the pisces changes? The cubic solver
in pisces is good enough for what we're using it for.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-05 Thread Denis Lila
Hi Jim.

 If you want to include these rendering tests as extra verification
 along with your other changes, then that is fine.

Ok, thanks. So, I'll update the performance webrev to include them.

 Also, I think we might have a script that forcibly checks the value
 of the @bug tag and ensures that it is a legal bug database number, so
 using a RedHat bug number won't work very well. Is there an existing
 bug that this could be tagged with?

Yeah, this is a problem. I tried finding an existing Sun bug for this
issue, but I couldn't, and we can't file one because it's already
fixed. It's just one test though, so we could simply leave it out.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-04 Thread Denis Lila
Hi Jim.

 The test as it is has a test case (I just chose random numbers to
 check and got lucky - d'oh!) that generates 1 solution from the new
 code even though the equation had 2 distinct solutions that weren't
 even near each other...

I figured out why this happens. It's because of cancellation in the
computation of D (two large numbers are subtracted and the result is
supposed to be 0 or close to 0, but it's about 1e-7, which wasn't
enough to pass the "is zero" test). I've been working on this and I
came up with a couple of different ways. They are in the attached
file (it's a modified version of your file CubicSolve.java).

The first thing I did was to modify solveCubicOld. I tried to get
a bit fancy and although I think I fixed the problems it had, the
end result is ugly, complicated and it has small problems, like
returning 3 very close roots when there should only be one.

The other solution is to just check if the roots of the derivative
are also roots of the cubic polynomial if only 1 root was computed
by the closed form algorithm. This doesn't have the numerical
accuracy of the first way (which used bisectRoots when things went wrong)
but it's much faster, doesn't have the multiple roots problem, and it's
much simpler. I called your trySolve function on a few hundred
polynomials with random roots in [-10, 10] and it never finds fewer
roots than there actually are. Sometimes it finds 3 roots when there are
only 2, but I don't think this is a huge problem.

I've attached what I have so far.
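
The salvage step boils down to something like this (a sketch of the
idea rather than the attached code; ERR is a placeholder tolerance and
the duplicate check is simplistic):

// Called when the closed form produced only res[0]. A root it missed
// must have multiplicity > 1, so it is also a root of the derivative;
// check the (at most 2) critical points against the cubic.
static int rescueMultipleRoots(double[] eqn, double[] res, double ERR) {
    int num = 1;
    double[] deriv = { eqn[1], 2 * eqn[2], 3 * eqn[3] };
    double[] crit = new double[2];
    int critCount = java.awt.geom.QuadCurve2D.solveQuadratic(deriv, crit);
    for (int i = 0; i < critCount; i++) {
        double t = crit[i];
        double v = eqn[0] + t * (eqn[1] + t * (eqn[2] + t * eqn[3]));
        if (Math.abs(v) < ERR && t != res[0]) {
            res[num++] = t; // a multiple root the closed form missed
        }
    }
    return num;
}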

Regards,
Denis.

- Original Message -
 Hi Denis,
 
 I'm attaching a test program I wrote that compares the old and new
 algorithms.
 
 Obviously the old one missed a bunch of solutions because it
 classified all solutions as "1 or 3", but the new one also sometimes
 misses a solution. You might want to turn this into an automated test
 for the bug (and maybe use it as a stress test with a random number
 generator).
 
 I think one problem might be that you use "is close to zero" to check
 if you should use special processing. I think any tests which say "do
 it this way and get fewer roots" should be conservative and if we are
 on the borderline and we can do the code that generates more solutions
 then we should generate more and then maybe refine the roots and
 eliminate duplicates. That way we can be (more) sure not to leave any
 roots unsolved.
 

 
 ...jim
import java.awt.geom.QuadCurve2D;
import java.util.Arrays;
import java.util.Random;

import static java.lang.Math.abs;
import static java.lang.Math.max;
import static java.lang.Math.ulp;

public class CubicSolver {
    public static int solveCubicOld(double eqn[], double res[]) {
        if (res == eqn) {
            // Copy the eqn so that we don't clobber it with the
            // roots.
            eqn = new double[4];
            System.arraycopy(res, 0, eqn, 0, 4);
        }

        // From Numerical Recipes, 5.6, Quadratic and Cubic Equations
        double d = eqn[3];
        if (d == 0.0) {
            // The cubic has degenerated to quadratic (or line or ...).
            return QuadCurve2D.solveQuadratic(eqn, res);
        }
        double a = eqn[2] / d;
        double b = eqn[1] / d;
        double c = eqn[0] / d;
        int roots = 0;
        double Q = (a * a - 3.0 * b) / 9.0;
        double R = (2.0 * a * a * a - 9.0 * a * b + 27.0 * c) / 54.0;
        double R2 = R * R;
        double Q3 = Q * Q * Q;
        a = a / 3.0;
        if (R2 < Q3) {
            double theta = Math.acos(R / Math.sqrt(Q3));
            Q = -2.0 * Math.sqrt(Q);

            res[roots++] = Q * Math.cos(theta / 3.0) - a;
            res[roots++] = Q * Math.cos((theta + Math.PI * 2.0) / 3.0) - a;
            res[roots++] = Q * Math.cos((theta - Math.PI * 2.0) / 3.0) - a;
        } else {
            boolean neg = (R < 0.0);
            double S = Math.sqrt(R2 - Q3);
            if (neg) {
                R = -R;
            }
            double A = Math.pow(R + S, 1.0 / 3.0);
            if (!neg) {
                A = -A;
            }
            double B = (A == 0.0) ? 0.0 : (Q / A);
            res[roots++] = (A + B) - a;
        }

        // we need it to have length 4. We will put the roots of the derivative
        // in deriv[1] and deriv[2]
        final double[] deriv = {eqn[1], 2*eqn[2], 3*eqn[3], 0};
        int critCount = QuadCurve2D.solveQuadratic(deriv, deriv);
        Arrays.sort(deriv, 0, critCount);
        Arrays.sort(res, 0, roots);
        // Even if there are fewer than 2 roots, this won't cause problems.
        deriv[2] = deriv[1];
        deriv[1] = deriv[0];
        // The roots of any polynomial must lie in [-M, M] where
        // M = 1 + max(abs(a0), ..., abs(a(n-1))) / abs(an)
        // http://en.wikipedia.org/wiki/Sturm%27s_theorem#Applications
        // Wikipedia says this result is due to Cauchy. There's no proof in the link,
        // but I proved it myself (it's a bit long to include here).
        double M = 1 + max(max(abs(a), abs(b)), 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-04 Thread Jim Graham

Hi Denis,

What about logic like this:

boolean checkRoots = false;
if (D < 0) {
// 3 solution form is possible, so use it
checkRoots = (D > -TINY);  // Check them if we were borderline
// compute 3 roots as before
} else {
double u = ...;
double v = ...;
res[0] = u+v; // should be 2*u if D is near zero
if (u close to v) {// Will be true for D near zero
res[1] = -res[0]/2;  // should be -u if D is near zero
checkRoots = true;  // Check them if we were borderline
// Note that q=0 case ends up here as well...
}
}
if (checkRoots) {
if (num > 2 && (res[2] == res[1] || res[2] == res[0])) {
num--;
}
if (num > 1 && res[1] == res[0]) {
res[1] = res[--num];  // Copies res[2] to res[1] if needed
}
for (int i = num-1; i >= 0; i--) {
res[i] = refine(res[i]);
for (int j = i+1; j < num; j++) {
if (res[i] == res[j]) {
res[i] = res[--num];
break;
}
}
}
}

Note that we lose the optimization of calculating just 2*u and -u for 
the 2 root case, but that only happened in rare circumstances.  Also, if 
D is near zero and negative, then we generate 3 roots using 
transcendentals and potentially refine one away, but that should also be 
an uncommon situation and there but for the grace of being a tiny 
negative number would we have gone anyway so I think it is OK to take 
the long way to the answer.


Also, one could argue that if we used the transcendentals to calculate 
the 3 roots, it couldn't hurt to refine the answers anyway.  The other 
solutions should have higher precision, but the transcendental results 
will be much less accurate.


Finally, this lacks the "refine them anyway if any of them are near 0 
or 1" rule - the original only did that if the transcendentals were 
used, but it would be nice to do that for any of the cases.  It might 
make sense to have a variant that takes a boolean indicating whether to 
ensure higher accuracy around 0 and 1, but that would require an API 
change request...
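
(Purely illustrative - the name and parameter are made up:

public static int solveCubic(double eqn[], double res[],
                             boolean refineNearZeroOne)

with the existing solveCubic(eqn, res) delegating to it passing false,
so current callers keep their behavior.)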


...jim

On 1/4/11 2:02 PM, Denis Lila wrote:

Hi Jim.


The test as it is has a test case (I just chose random numbers to check
and got lucky - d'oh!) that generates 1 solution from the new code even
though the equation had 2 distinct solutions that weren't even near each
other...


I figured out why this happens. It's because of cancellation in the
computation of D (two large numbers are subtracted and the result is
supposed to be 0 or close to 0, but it's about 1e-7, which wasn't
enough to pass the "is zero" test). I've been working on this and I
came up with a couple of different ways. They are in the attached
file (it's a modified version of your file CubicSolve.java).

The first thing I did was to modify solveCubicOld. I tried to get
a bit fancy and although I think I fixed the problems it had, the
end result is ugly, complicated and it has small problems, like
returning 3 very close roots when there should only be one.

The other solution is to just check if the roots of the derivative
are also roots of the cubic polynomial if only 1 root was computed
by the closed form algorithm. This doesn't have the numerical
accuracy of the first way (which used bisectRoots when things went wrong)
but it's much faster, doesn't have the multiple roots problem, and it's
much simpler. I called your trySolve function on a few hundred
polynomials with random roots in [-10, 10] and it never finds fewer
roots than there actually are. Sometimes it finds 3 roots when there are
only 2, but I don't think this is a huge problem.

I've attached what I have so far.

Regards,
Denis.

- Original Message -

Hi Denis,

I'm attaching a test program I wrote that compares the old and new
algorithms.

Obviously the old one missed a bunch of solutions because it
classified all solutions as "1 or 3", but the new one also sometimes
misses a solution. You might want to turn this into an automated test
for the bug (and maybe use it as a stress test with a random number
generator).

I think one problem might be that you use "is close to zero" to check
if you should use special processing. I think any tests which say "do
it this way and get fewer roots" should be conservative and if we are
on the borderline and we can do the code that generates more solutions
then we should generate more and then maybe refine the roots and
eliminate duplicates. That way we can be (more) sure not to leave any
roots unsolved.





...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-28 Thread Jim Graham

Hi Denis,

I'm attaching a test program I wrote that compares the old and new 
algorithms.


Obviously the old one missed a bunch of solutions because it classified 
all solutions as "1 or 3", but the new one also sometimes misses a 
solution.  You might want to turn this into an automated test for the 
bug (and maybe use it as a stress test with a random number generator).


I think one problem might be that you use "is close to zero" to check if 
you should use special processing.  I think any tests which say "do it 
this way and get fewer roots" should be conservative and if we are on 
the borderline and we can do the code that generates more solutions then 
we should generate more and then maybe refine the roots and eliminate 
duplicates.  That way we can be (more) sure not to leave any roots unsolved.


The test as it is has a test case (I just chose random numbers to check 
and got lucky - d'oh!) that generates 1 solution from the new code even 
though the equation had 2 distinct solutions that weren't even near each 
other...


...jim



CubicSolver.java


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-24 Thread Denis Lila
Hi Jim.

 Unfortunately, this means that the names here
 and the values assigned to them and the comment above them conflict.
 If the variables could be named p/3 and q/2 then all would be clear,
 but I don't know how to do that naming very easily. Perhaps the
 comment could be simply reworded:
 
 // substitute blah blah blah
 // x^3 + Px + Q = 0
 // Since we actually need P/3 and Q/2 for all of the
 // calculations that follow, we will calculate
 // p = P/3
 // q = Q/2
 // instead and use those values for simplicity of the code.

Good point. Done.

 Line 1105 - single quotes in comments freaks out my version of
 gnuemacs.
 I usually try to avoid them, except in pairs, but there isn't a better
 way to word this comment. :-(

We could always go with "the method of Cardano", although
that doesn't sound good at all. Or we could use a backtick instead of
the single quote - it looks similar enough.

 Lines 1157-1163 - the old code used to copy the eqn before it got
 clobbered with roots. Here it is too late. You probably need to move
 this code up near line 1135 before the 3 roots are stuffed into the
 res array. (Did you test the eqn==res case?)

You're right. I'm sorry about this.

 I noticed that the "Casus irreducibilis" case isn't in Cardano's
 method.
 He only finds roots for the 1 and 2 root case and punts for 3 roots.
 So, this is someone else's method. It would be nice to figure out who
 or what and list a reference, even though the Graphics Gems and the
 old
 code didn't. The closest reference I can find is unattributed on
 Wikipedia, but you could include it in a comment for reference:
 
 http://en.wikipedia.org/wiki/Cubic_function#Trigonometric_.28and_hyperbolic.29_method

Done.

 Line 1133 - I don't understand why that term has -q in it. The above
 link and the original code both computed essentially the arccos of
 this
 formula without the negation of q. ??? Since acos(-v) == pi - acos(v)
 this would seem to negate the result and bias it by pi/3. Negating it
 won't affect the eventual cosine, but the bias by pi/3 will. Am I
 missing something?

I think you are. What's going on is that our code is computing
ret[0] = t2, ret[1] = t0, ret[2] = t1 (where t0, t1, t2 are the tk's
from the wikipedia link).

Let X = (3q/2p)*sqrt(-3/p) where p and q are the ones from the wikipedia
article, not our code.
So, when we do ret[0] = t * cos(phi), that's:
  = t * cos(acos(-X)/3)
  = t * cos((pi - acos(X))/3)
  = t * cos(pi/3 - acos(X)/3)
  = t * cos(acos(X)/3 - pi/3)         (cosine is even)
  = -t * cos(acos(X)/3 - pi/3 - pi)   (cos(y - pi) = -cos(y))
  = -t * cos(acos(X)/3 - 2*2*pi/3)
  = t2
(the sign works out because our t is the negative of the wikipedia
article's 2*sqrt(-p/3)).

ret[1] and ret[2] are very similar to prove - you just add/subtract pi/3
from the argument to cos.
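
If you want to double-check that chain numerically, throwaway code
like the following (names made up) should print nothing:

public class IdentityCheck {
    public static void main(String[] args) {
        java.util.Random r = new java.util.Random();
        for (int i = 0; i < 1000; i++) {
            double X = 2 * r.nextDouble() - 1;
            // our t is negative, though the identity holds for any t
            double t = -2 * r.nextDouble();
            double lhs = t * Math.cos(Math.acos(-X) / 3);
            double rhs = -t * Math.cos(Math.acos(X) / 3 - 2 * 2 * Math.PI / 3);
            if (Math.abs(lhs - rhs) > 1e-9) {
                System.out.println("mismatch at X = " + X + ", t = " + t);
            }
        }
    }
}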

I unfortunately don't have access to the icedtea servers at this moment,
so I attached a patch. I hope that's ok.

Regards,
Denis.
--- dashing/2d/jdk/src/share/classes/java/awt/geom/CubicCurve2D.java	2010-12-24 10:01:45.378556843 -0500
+++ cc2d/2d/jdk/src/share/classes/java/awt/geom/CubicCurve2D.java	2010-12-24 11:01:42.205614715 -0500
@@ -1083,24 +1083,63 @@
  * @since 1.3
  */
 public static int solveCubic(double eqn[], double res[]) {
-// From Numerical Recipes, 5.6, Quadratic and Cubic Equations
-double d = eqn[3];
-if (d == 0.0) {
-// The cubic has degenerated to quadratic (or line or ...).
+// From Graphics Gems:
+// http://tog.acm.org/resources/GraphicsGems/gems/Roots3And4.c
+final double d = eqn[3];
+if (d == 0) {
 return QuadCurve2D.solveQuadratic(eqn, res);
 }
-double a = eqn[2] / d;
-double b = eqn[1] / d;
-double c = eqn[0] / d;
-int roots = 0;
-double Q = (a * a - 3.0 * b) / 9.0;
-double R = (2.0 * a * a * a - 9.0 * a * b + 27.0 * c) / 54.0;
-double R2 = R * R;
-double Q3 = Q * Q * Q;
-a = a / 3.0;
-if (R2 < Q3) {
-double theta = Math.acos(R / Math.sqrt(Q3));
-Q = -2.0 * Math.sqrt(Q);
+/* normal form: x^3 + Ax^2 + Bx + C = 0 */
+final double A = eqn[2] / d;
+final double B = eqn[1] / d;
+final double C = eqn[0] / d;
+
+
+//  substitute x = y - A/3 to eliminate quadratic term:
+// y^3 +Py + Q = 0
+//
+// Since we actually need P/3 and Q/2 for all of the
+// calculations that follow, we will calculate
+// p = P/3
+// q = Q/2
+// instead and use those values for simplicity of the code.
+
+double sq_A = A * A;
+double p = 1.0/3 * (-1.0/3 * sq_A + B);
+double q = 1.0/2 * (2.0/27 * A * sq_A - 1.0/3 * A * B + C);
+
+/* use Cardano's formula */
+
+double cb_p = p * p * p;
+double D = q * q + cb_p;
+
+int num;
+// XXX: we consider 0 to be anything within 1e-9 of 0.
+// Is this really right? Maybe we should use a bound that 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-23 Thread Jim Graham

Hi Denis,

Line 1099 - I decided to check out Cardano's method and noticed a 
discrepancy.  The comment here says we are calculating the p and q for 
this equation, but the values assigned to the p and q variables in lines 
1102,1103 happen to be p/3 and q/2.  That's fine because almost all of 
the values needed in the remaining logic in Cardano's method actually 
need those values instead of the original p and q so it makes sense to 
calculate them up front.  Unfortunately, this means that the names here 
and the values assigned to them and the comment above them conflict.  If 
the variables could be named "p/3" and "q/2" then all would be clear, 
but I don't know how to do that naming very easily.  Perhaps the comment 
could be simply reworded:


// substitute blah blah blah
//x^3 + Px + Q = 0
// Since we actually need P/3 and Q/2 for all of the
// calculations that follow, we will calculate
//p = P/3
//q = Q/2
// instead and use those values for simplicity of the code.

Line 1105 - single quotes in comments freaks out my version of gnuemacs. 
 I usually try to avoid them, except in pairs, but there isn't a better 
way to word this comment.  :-(


Lines 1157-1163 - the old code used to copy the eqn before it got 
clobbered with roots.  Here it is too late.  You probably need to move 
this code up near line 1135 before the 3 roots are stuffed into the res 
array.  (Did you test the eqn==res case?)


I noticed that the "Casus irreducibilis" case isn't in Cardano's method. 
 He only finds roots for the 1 and 2 root case and punts for 3 roots. 
So, this is someone else's method.  It would be nice to figure out who 
or what and list a reference, even though the Graphics Gems and the old 
code didn't.  The closest reference I can find is unattributed on 
Wikipedia, but you could include it in a comment for reference:


http://en.wikipedia.org/wiki/Cubic_function#Trigonometric_.28and_hyperbolic.29_method

Line 1133 - I don't understand why that term has -q in it.  The above 
link and the original code both computed essentially the arccos of this 
formula without the negation of q.  ???  Since acos(-v) == pi - acos(v) 
this would seem to negate the result and bias it by pi/3.  Negating it 
won't affect the eventual cosine, but the bias by pi/3 will.  Am I 
missing something?


...jim

On 12/20/2010 9:36 AM, Denis Lila wrote:

Hi Jim.


Lines 1094-1096, they could also be NaN if any of the numerators were
also zero and these tests might fail (but only for the case of all of
them being zero I guess, otherwise one of the other divisions would
result in infinity). Are accidental infinities (caused by overflow
rather than d==0.0) really a problem for the code to handle?


I'm not sure if they're a problem, but I thought that case should have
been handled just for robustness. However, I've changed the test
to d==0 because testing for infinities should be done, but not there.
For example, if the constant term was huge and d==0.5 we could get
an infinity but that shouldn't really be handled as a quadratic
polynomial. I will deal better with these cases in a future webrev.


I just noticed that the code you are replacing actually used to refine
the roots so maybe you should do some of this magic.


I missed that in the original code. I changed it now.


Also, in the webrev you'll find five regression tests that I would like
to push to openjdk7. They test for various problems the rendering engine
used to have. They're all pretty simple and I would appreciate it if you
could take a quick look at them. They're in the same webrev as cc2d because
it was more convenient for me, but obviously when/if they're pushed they
will be a separate changeset.

One more thing: there is a regression test for the rendering engine
called TestNPE that I think is problematic because it doesn't
necessarily test the rendering engine. It just draws an antialiased
line, which could be handled in any number of ways, so it's not very
robust. In fact, after we're done with the parallelogram pipelines,
the code that used to throw the NPE won't even execute, making this
test useless. We should either discard it or change it to use the
rendering engine explicitly, like my tests do. What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-23 Thread Jim Graham
The regression tests for this bug do not call the method directly.  They 
may exercise the function indirectly in some pipelines, but not all 
pipelines will use this method (the current version of Pisces in OpenJDK 
doesn't even use it until you integrate your other changes as far as I 
know).


I'd write a regression test for this bug that is directly applicable to 
the method being tested (find what values are being handed to the method 
by these test cases and then just call Cubic.solveCubic directly with 
those values and figure out if the answers are reasonable).


If you want to include these rendering tests as extra verification along 
with your other changes, then that is fine.


Also, I think we might have a script that forcibly checks the value of 
the @bug tag and ensures that it is a legal bug database number, so 
using a RedHat bug number won't work very well.  Is there an existing 
bug that this could be tagged with?


...jim

On 12/20/2010 9:36 AM, Denis Lila wrote:

Hi Jim.


Lines 1094-1096, they could also be NaN if any of the numerators were
also zero and these tests might fail (but only for the case of all of
them being zero I guess, otherwise one of the other divisions would
result in infinity). Are accidental infinities (caused by overflow
rather than d==0.0) really a problem for the code to handle?


I'm not sure if they're a problem, but I thought that case should have
been handled just for robustness. However, I've changed the test
to d==0 because testing for infinities should be done, but not there.
For example, if the constant term was huge and d==0.5 we could get
an infinity but that shouldn't really be handled as a quadratic
polynomial. I will deal better with these cases in a future webrev.


I just noticed that the code you are replacing actually used to refine
the roots so maybe you should do some of this magic.


I missed that in the original code. I changed it now.


Also, in the webrev you'll find five regression tests that I would like
to push to openjdk7. They test for various problems the rendering engine
used to have. They're all pretty simple and I would appreciate it if you
could take a quick look at them. They're in the same webrev as cc2d because
it was more convenient for me, but obviously when/if they're pushed they
will be a separate changeset.

One more thing: there is a regression test for the rendering engine
called TestNPE that I think is problematic because it doesn't
necessarily test the rendering engine. It just draws an antialiased
line, which could be handled in any number of ways, so it's not very
robust. In fact, after we're done with the parallelogram pipelines,
the code that used to throw the NPE won't even execute, making this
test useless. We should either discard it or change it to use the
rendering engine explicitly, like my tests do. What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-17 Thread Jim Graham

Hi Denis,

Lines 1094-1096, they could also be NaN if any of the numerators were 
also zero and these tests might fail (but only for the case of all of 
them being zero I guess, otherwise one of the other divisions would 
result in infinity).  Are accidental infinities (caused by overflow 
rather than d==0.0) really a problem for the code to handle?


I don't have any recommendations on your comment about the code that 
tests q for zero.  You could probably check if 2*u and -u were distinct 
as an alternate test, but that would cost you a cbrt() call.  In 
general, though, I'm guessing it's a rare case so saving the call to 
cbrt() is probably not important.  I will note, though, that if a number 
is very close to 0, but not 0, then its cube root will be a larger 
number than the original.


I just noticed that the code you are replacing actually used to refine 
the roots so maybe you should do some of this magic.  However, it only 
bothered to refine the roots if there were 3 roots and they were near 0 
or 1 (because that might cause the caller to reject the root if it fell 
on the wrong side of 0 or 1).  Either way, look and see if fixRoots() 
and its friends are dead code now and/or if they should be replaced with 
your refinement techniques below.


I tried to review the code for correctness of formulae, but since I 
never really understood how the first version worked (I just copied it 
from Numerical Recipes), all I could do was to compare to the old 
version that it was similar to.  Frustratingly, the variable names 
conflicted and one of the values was calculated with reversed sign so it 
ended up being more frustrating than educational.  :-(


Since you apparently tested the new code extensively and got it from a 
source that had some amount of editorial review - I'll trust that 
process instead of my crossed eyes...


...jim

On 12/15/2010 7:13 AM, Denis Lila wrote:

Hi Jim.


Also, I wrote new hit testing code in jdk6 that used bezier recursion
to compute the values and it ran way faster than any root-finding methods
(mainly because all you have to worry about is subdividing enough that
the curve can be classified as above, below, or to the left or right
and you're done), so the containment methods could probably be fixed by
simply using the new code in sun.awt.geom.Curve and this method could
be updated with the new equations you found and left as an approximate
method like it always has been?


Well, I already have half the code I need to implement the ideas I
wrote, so... why not?

However, making solveCubic that good does not really seem to be a high
priority, so how about we quickly push just the new implementation I
found so we can fix most cases where an obvious root is missed, then
push this dashing/AA webrev (which will depend on the new solveCubic),
then I can fix the implementation of intersect using the recursive
subdivision you mentioned, and then I can take my time finishing
the implementation of the ideas from my last e-mail (these bugs/rfes
have waited 10+ years - they can wait 1-2 more months). Right now,
I would like to give priority to better pisces support of the new
parallelogram pipes and this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=662230

Here's the webrev for the new solveCubic implementation:
http://icedtea.classpath.org/~dlila/webrevs/cc2d/webrev/

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-15 Thread Denis Lila
Hi Jim.

 Also, I wrote new hit testing code in jdk6 that used bezier recursion
 to compute the values and it ran way faster than any root-finding methods
 (mainly because all you have to worry about is subdividing enough that
 the curve can be classified as above, below, or to the left or right
 and you're done), so the containment methods could probably be fixed by 
 simply using the new code in sun.awt.geom.Curve and this method could
 be updated with the new equations you found and left as an approximate 
 method like it always has been?

Well, I already have half the code I need to implement the ideas I
wrote, so... why not?

However, making solveCubic that good does not really seem to be a high
priority, so how about we quickly push just the new implementation I
found so we can fix most cases where an obvious root is missed, then
push this dashing/AA webrev (which will depend on the new solveCubic),
then I can fix the implementation of intersect using the recursive
subdivision you mentioned, and then I can take my time finishing
the implementation of the ideas from my last e-mail (these bugs/rfes
have waited 10+ years - they can wait 1-2 more months). Right now,
I would like to give priority to better pisces support of the new
parallelogram pipes and this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=662230

Here's the webrev for the new solveCubic implementation:
http://icedtea.classpath.org/~dlila/webrevs/cc2d/webrev/

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-14 Thread Denis Lila
Hi Jim.

 You might want to submit it as a separate push and get credit for
 fixing 4645692 (solveCubic doesn't return all answers),

Sure, that sounds good. Reading through the code I found, I spotted
a few things that might have been problematic in some extremely
rare cases. I've been working on making it extra robust. I think I'm
pretty close to finishing.
I have one question though: how fast does this have to be? I can come
up with fairly reasonable examples for which both CubicCurve2D.solveCubic
and the implementation I found give very inaccurate results
(i.e. evaluating the cubic on the computed roots gives numbers as
high as 1e-4). I read on the bug reports that what we should do is
treat the closed form as a crude approximation and then use the Newton
method to really find the roots. I like this idea (with perhaps the
exception of the Newton method - I think we might be better off
using something like false position, or even simply a binary search).
The binary search gives results that when evaluated are as small as
1e-17 which is as close to correct as possible in double precision
(because my termination condition was while (middle != start && middle != end)).
That didn't even take an outrageous number of iterations - 77 was the
highest I observed.
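
For reference, the search itself is nothing special - roughly this
sketch, assuming f changes sign between start and end:

// Bisect until the midpoint is no longer strictly between start and
// end, i.e. until we are down to adjacent doubles.
static double bisectRoot(double[] eqn, double start, double end) {
    double fStart = evalCubic(eqn, start);
    double middle = 0.5 * (start + end);
    while (middle != start && middle != end) {
        double fMid = evalCubic(eqn, middle);
        if (fMid == 0.0) {
            return middle;
        }
        if ((fMid > 0.0) == (fStart > 0.0)) {
            start = middle;
            fStart = fMid;
        } else {
            end = middle;
        }
        middle = 0.5 * (start + end);
    }
    return middle;
}

static double evalCubic(double[] eqn, double t) {
    return eqn[0] + t * (eqn[1] + t * (eqn[2] + t * eqn[3]));
}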

 4724552
 4493128
 4074742
 4724556
 (etc.  Those were just the bugs I found on the first 2 pages of a bug
 database search)
 Double (triple, etc.) credit - woohoo!  ;-)

Isn't there some sort of diminishing returns after the first duplicate ;-)

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-14 Thread Jim Graham

Hi Denis,

That sounds like some very good ideas for making this method very accurate.

On the other hand, we're starting to get into the territory where an 
advanced math package should be catering to these requirements.  The 
solveCubic was an internal helper function for implementing the hit 
testing methods that I decided to make public back in 1.2 days.  There's 
a question on how much accuracy we should bother with.


Also, I wrote new hit testing code in jdk6 that used bezier recursion to 
compute the values and it ran way faster than any root-finding methods 
(mainly because all you have to worry about is subdividing enough that 
the curve can be classified as above, below, or to the left or right and 
you're done), so the containment methods could probably be fixed by 
simply using the new code in sun.awt.geom.Curve and this method could be 
updated with the new equations you found and left as an approximate 
method like it always has been?


...jim

On 12/14/2010 5:57 PM, Denis Lila wrote:

Hi Jim.


How big are these errors expressed as multiples of the ULP of the
coefficients?  Obviously 1e-17 is a lot smaller than 1e-4, but was 1e-17
representing just a couple of bits of error or was it still way off
with respect to the numbers being used? And were these fairly obscure
equations that were off?


The coefficients I used were eqn={-0.1, 0, 1, 1e-7} so compared to the ulps
of the coefficients, 1e-4 is pretty large.
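
To be clear about how I measured: the error is the worst residual from
plugging each computed root back into the cubic, along the lines of
this sketch:

static double worstResidual(double[] eqn, double[] roots, int numRoots) {
    double worst = 0.0;
    for (int i = 0; i < numRoots; i++) {
        double t = roots[i];
        // Horner evaluation of eqn[0] + eqn[1]*t + eqn[2]*t^2 + eqn[3]*t^3
        double v = eqn[0] + t * (eqn[1] + t * (eqn[2] + t * eqn[3]));
        worst = Math.max(worst, Math.abs(v));
    }
    return worst;
}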

I'm about to go now, but I would like to write this idea first:
it seems to me like the number of roots reported is much more
important than whether their accuracy is 1e-4 or 1e-17. So,
how about we solve for the roots of the derivative (which can be
done very quickly). Computing lim{x -> +/-inf} f(x) is very easy
(just a test on the most significant coefficient). We can then
evaluate the polynomial on the critical points and this would
allow us to very quickly compute the exact number of roots. If
the number computed using the closed form formula does not
match the real number, we do some refining work.
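
In code, that count would be something like this sketch (it ignores
the degenerate eqn[3] == 0 case):

// Exact count of the distinct real roots of
// eqn[0] + eqn[1]*x + eqn[2]*x^2 + eqn[3]*x^3, from the sign of f at
// -inf, at each critical point, and at +inf. A strict sign change is
// a simple root; f == 0 at a critical point is a multiple root.
static int countRealRoots(double[] eqn) {
    double[] deriv = { eqn[1], 2 * eqn[2], 3 * eqn[3] };
    double[] crit = new double[2];
    int n = java.awt.geom.QuadCurve2D.solveQuadratic(deriv, crit);
    if (n <= 0) {
        return 1; // monotonic cubic: exactly one real root
    }
    java.util.Arrays.sort(crit, 0, n);
    double[] vals = new double[n + 2];
    vals[0] = (eqn[3] > 0) ? -1 : 1; // sign of the limit at -inf
    vals[n + 1] = -vals[0];          // sign of the limit at +inf
    for (int i = 0; i < n; i++) {
        double t = crit[i];
        vals[i + 1] = eqn[0] + t * (eqn[1] + t * (eqn[2] + t * eqn[3]));
    }
    int roots = 0;
    for (int i = 0; i <= n; i++) {
        if (vals[i] * vals[i + 1] < 0) {
            roots++; // simple root strictly inside this interval
        } else if (i < n && vals[i + 1] == 0.0) {
            roots++; // multiple root exactly at a critical point
        }
    }
    return roots;
}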

If we really wanted to optimize, we could also record how close
constants like D and q are to 0, and if they're within a certain
threshold, we could mark the roots they spawn as suspicious,
and only do the test in the above paragraph (i.e. solving for
critical points) if there are suspicious roots. And if the
computed numbers of roots don't match up, we could concentrate
on refining only the suspicious roots.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Denis Lila
Hi Jim.

 Woohoo!

:)

  How often do we end up needing getTCloseTo in practice?
 
  It depends on the ratios of the lengths of the sides of the control
  polygon. The closer they are to 1, the less we need it. I'm not sure
  how to answer more precisely - for that I would need a representative
  sample of curves drawn in practice.
 
 I was thinking of, say, when applied to a circle.  Does that get by 
 without needing getTCloseTo?

I tested it on a couple of quarter circles of greatly varying radii,
and surprisingly, it does get by without needing getTCloseTo (or its
equivalent, after your flattening proposal in your previous e-mail).

 Good job!

Well, thanks for all your help.
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Denis Lila
Hi Jim.

 With respect to finding a cubic root, currently you are doing that in
 2 dimensions, but what if we converted to 1 dimension?
 Consider that the control polygon is "fairly linear".  What if we 
 rotated our perspective so that it was horizontal and then squashed it
 flat?  Consider instead a 1 dimensional bezier with control values
 of:
 (where |mn| is the length of the m-n control polygon of the original
 curve - sum of all segments from point m to point n)
 0.0, |01|, |02|, |03|

I had thought of something like this but I was afraid that the loss of
Curve.java:141-152 would hurt accuracy. I implemented this though, and
testing shows that that's not a problem. This should also double the
performance of the computation since we only run one cubic root finder,
and that was the major bottleneck.
I updated the webrev.

Should I remove some no longer needed methods, like getTCloseTo?

 Solve that 1 dimensional bezier for v=(leaflen*polylen)/linelen...

Don't you mean (targetLength - lenAtLastT) * polylen / leaflen?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Denis Lila
Hi again.

I found an implementation of a closed form cubic root solver
(from graphics gems):
http://read.pudn.com/downloads21/sourcecode/graph/71499/gems/Roots3And4.c__.htm

I did some micro benchmarks, and it's about 25% slower than the one I have.
I'm thinking we should use it anyway because it's much better in every
other way: it finds all roots, it doesn't make its callers give it a root
array that is longer than the total number of roots, and most importantly,
it doesn't fail because of an iteration upper bound. From my testing, I noticed
that this happens too frequently for comfort in my cubicRootsInAB. I haven't
noticed any rendering artifacts caused by this, but I also haven't tried
rendering every possible curve and it's better to prevent bugs than fix them.

What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Jim Graham
Very nice!  How does it compare to CubicCurve.solveCubic() (which I know 
has issues with the 2 root case, but I got it from a reliable source - 
some textbook on Numerical Recipes)?


Also, one area that I had issues with the version I used in CC2D was 
that it chose a hard cutoff to classify the number of points and 
floating point errors might cause a given cubic to jump over that point 
despite the fact that the equation was of the other form.  Hopefully 
that jumping between root counts only happens when two roots are very 
close to each other so that the choice is between listing "N" or 
"N+epsilon and N-epsilon" - I can live with that inaccuracy, but it 
seemed like the version in CC2D might choose between "it's either a 
single root of 4.25, or three roots of -127.3, 23.5, and 42.6" and I 
would scratch my head and think - wow, what a difference a bit made!


Luckily I don't think we actually ever relied on CC2D.solveCubic for 
correctness in any of our code, but it would be nice to fix it if a more 
stable method is available.


...jim

On 12/13/2010 12:23 PM, Denis Lila wrote:

Hi again.

I found an implementation of a closed form cubic root solver
(from graphics gems):
http://read.pudn.com/downloads21/sourcecode/graph/71499/gems/Roots3And4.c__.htm

I did some micro benchmarks, and it's about 25% slower than the one I have.
I'm thinking we should use it anyway because it's much better in every
other way: it finds all roots, it doesn't make its callers give it a root
array that is longer than the total number of roots, and most importantly,
it doesn't fail because of an iteration upper bound. From my testing, I noticed
that this happens too frequently for comfort in my cubicRootsInAB. I haven't
noticed any rendering artifacts caused by this, but I also haven't tried
rendering every possible curve and it's better to prevent bugs than fix them.

What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Jim Graham

Hi Denis,

Those sound like just the kind of problems I believed existed in the 
CC2D algorithm.


You might want to submit it as a separate push and get credit for fixing 
4645692 (solveCubic doesn't return all answers), and maybe even the 
following failures in the containment methods (which could be closed as 
dups if this fixes the problems) as well:


4724552
4493128
4074742
4724556
(etc.  Those were just the bugs I found on the first 2 pages of a bug 
database search)


Double (triple, etc.) credit - woohoo!  ;-)

...jim

On 12/13/2010 2:30 PM, Denis Lila wrote:



Very nice!  How does it compare to CubicCurve.solveCubic() (which I
know
has issues with the 2 root case, but I got it from a reliable source
-
some textbook on Numerical Recipes)?


I wrote a test that generated 2559960 polynomials, and in 2493075 of
those, the computed roots were identical to within 1e-9. They were
different in the remaining 66885 cases, so that's 97.4% agreement.

I've looked through some of the differences, and in every case the
function from graphics gems is better in one of two ways:
1. the gg version will report more roots than the cc2d version, in
which case the polynomial has a double root and the cc2d version
completely misses it (example poly: a=19.00, b=-20.00,
c=-17.00, d=18.00, where cc2d misses the root at 1).

2. the gg version will report fewer roots than the cc2d version, in
which case there was a 0 root and the cc2d version incorrectly split
it into -1e-163 and 1e-163.

So, the graphics gems version seems to be much more stable. It
does have a problem where it can return NaN sometimes, because it
assumes that the polynomial is not a quadratic one, but that can
easily be fixed.

So, should I put this new cubic root finder in CubicCurve.solveCubic
and use that in pisces?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Denis Lila
Hi Jim.

 Actually, even if the lengths aren't close the lengths may give you 
 enough information about the acceleration along the curve that you can
 do a decent approximation of the accelerated T value.  The T could be
 biased by some formula that is weighted by the ratios of the control 
 polygon lengths.  As a very crude example, say you assumed that if the
 scaled leaf length fell into the first polygon segment's length then t
 should be proportionally a value from 0 to 1/3, and if it fell between
 the first poly len and the second then it would be proportionally a 
 value from 1/3 to 2/3, etc.  The code might look like this:

I implemented this, and I'm not sure how to use this new approximation.
I mean, currently there are really two t's. The first one is the parameter
along the line connecting the 2 endpoints of the curve and the second
is the result that we return. We can't use this new approximation to
replace the first t, because for a curve like
(0,0),(1000,0),(1000,0),(1000,0) and a desired length of 500, the t
would be 1/6, so the computed (x,y) would be (1000/6,0) instead of
(500,0), which would be right (and which is what we compute now).

The only sensible way to use this kind of approximation would be as a
direct replacement for getTCloseTo. I tried that, and its quality to
speed ratio compared to getTCloseTo is remarkably good, but it's not
really usable because the differences are very noticeable.
I'll try to implement a good cubic root finder this weekend, and maybe
then getTCloseTo will be much faster and we won't have to worry about
this.

 (Also, note that in the original code we probably would have just been
 dashing along the flattened curve anyway and so we might have just
 been 
 using the raw linear t in that case - so anything we do here is a 
 refinement of what we used to do and icing on the cake to some
 extent)...

I'd say the dashing precision is better than what we used to have. It's
only slightly better, but even that is impressive when you consider that
we were doing up to 1024 subdivisions before, and now it's only 16, I think.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Denis Lila
Hi Jim.

 By without this optimization do you mean back when you did a full
 scan for the proper T?

Yes. The improvement shown by the benchmarks is substantial.

 Then this is great news!

Indeed :-)

 How often do we end up needing getTCloseTo in practice?

It depends on the ratios of the lengths of the sides of the control
polygon. The closer they are to 1, the less we need it. I'm not sure
how to answer more precisely - for that I would need a representative
sample of curves drawn in practice.

However, I did run dashing and stroking benchmarks. Stroking shows
25% speedup (because of the refinements to the transform pipeline)
and cubic dashing has improved by 80%. Of course, all this is useless
if I've done something to make things look horrible, so I'm going to
run the gfx tests again.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 
   ...jim
 
 On 12/8/2010 1:54 PM, Denis Lila wrote:
  Hi Jim.
 
   How about if the 3 segments of the control polygon are all close to
   each other in length and angle, then the optimization applies.  Is
   that easy to test?
 
  Hmm, that would actually be extremely easy to test and it would cost
  almost nothing. We already compute the control polygon lengths in
  onLeaf, and we're already assuming that every leaf is flat enough, so
  we probably don't even need to check the angles. Comparing lengths
  should be good enough.
  I'll try this out.
 
  I implemented this and updated the webrev. I tested on a few curves
  with very high and very low accelerations. The results were identical
  to what I used to get without this optimization. When the curve has
  no acceleration, all calls of getTCloseTo are skipped.
 
  Regards,
  Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Denis Lila
 Of course, all this is useless
 if I've done something to make things look horrible, so I'm going to
 run the gfx tests again.

I just ran them. All is good. The only change compared to the old test
result is that the number of dashed round rectangles that are identical
to what is produced by the closed source code has doubled. This isn't
all that significant, since it has gone from 4 identical images out of
162 total images to 8 out of 162, but it's still nice, and it
definitely means that nothing has gotten worse.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Jim Graham

On 12/10/2010 8:27 AM, Denis Lila wrote:

Hi Jim.

Yes. The improvement shown by the benchmarks is substantial.


Then this is great news!


Indeed :-)


Woohoo!


How often do we end up needing getTCloseTo in practice?


It depends on the ratios of the lengths of the sides of the control
polygon. The closer they are to 1, the less we need it. I'm not sure
how to answer more precisely - for that I would need a representative
sample of curves drawn in practice.


I was thinking of, say, when applied to a circle.  Does that get by 
without needing getTCloseTo?



However, I did run dashing and stroking benchmarks. Stroking shows
25% speedup (because of the refinements to the transform pipeline)
and cubic dashing has improved by 80%. Of course, all this is useless
if I've done something to make things look horrible, so I'm going to
run the gfx tests again.


Good job!

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Jim Graham

Hi Denis,

The example I gave was intended to be very crude - I was simply 
describing the technique, but as I said it would require better math to 
really know what the right formula would be.


With respect to finding a cubic root, currently you are doing that in 2 
dimensions, but what if we converted to 1 dimension?


Consider that the control polygon is "fairly linear".  What if we 
rotated our perspective so that it was horizontal and then squashed it 
flat?  Consider instead a 1 dimensional bezier with control values of:


(where |mn| is the length of the m-n control polygon of the original 
curve - sum of all segments from point m to point n)


0.0, |01|, |02|, |03|

Solve that 1 dimensional bezier for v=(leaflen*polylen)/linelen...
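
In code, the squash-and-solve step might be roughly this (a sketch -
the method name is made up, and it leans on a solveCubic that reports
all the roots):

// Control values of the squashed 1-D cubic bezier are 0, l01, l02,
// l03, where l0k is the control-polygon length from point 0 to point
// k. Solving B(t) = v is then a single cubic root find in [0, 1].
static double solveSquashed(double l01, double l02, double l03, double v) {
    // power-basis coefficients of B(t) - v, from
    // B(t) = 3*l01*t*(1-t)^2 + 3*l02*t^2*(1-t) + l03*t^3
    double[] eqn = {
        -v,
        3 * l01,
        3 * (l02 - 2 * l01),
        l03 - 3 * l02 + 3 * l01
    };
    double[] res = new double[3];
    int n = java.awt.geom.CubicCurve2D.solveCubic(eqn, res);
    for (int i = 0; i < n; i++) {
        if (res[i] >= 0 && res[i] <= 1) {
            return res[i];
        }
    }
    return -1; // no root in [0,1]; caller falls back to the linear t
}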

...jim

On 12/10/2010 8:23 AM, Denis Lila wrote:

Hi Jim.


Actually, even if the lengths aren't close the lengths may give you
enough information about the acceleration along the curve that you can
do a decent approximation of the accelerated T value.  The T could be
biased by some formula that is weighted by the ratios of the control
polygon lengths.  As a very crude example, say you assumed that if the
scaled leaf length fell into the first polygon segment's length then t
should be proportionally a value from 0 to 1/3, and if it fell between
the first poly len and the second then it would be proportionally a
value from 1/3 to 2/3, etc.  The code might look like this:


I implemented this, and I'm not sure how to use this new approximation.
I mean, currently there are really two t's. The first one is the parameter
along the line connecting the 2 endpoints of the curve and the second
is the result that we return. We can't use this new approximation to
replace the first t, because for a curve like
(0,0),(1000,0),(1000,0),(1000,0) and a desired length of 500, the t
would be 1/6, so the computed (x,y) would be (1000/6,0) instead of
(500,0), which would be right (and which is what we compute now).

The only sensible way to use this kind of approximation would be as a
direct replacement for getTCloseTo. I tried that, and its quality to
speed ratio compared to getTCloseTo is remarkably good, but it's not
really usable because the differences are very noticeable.
I'll try to implement a good cubic root finder this weekend, and maybe
then getTCloseTo will be much faster and we won't have to worry about
this.


(Also, note that in the original code we probably would have just been
dashing along the flattened curve anyway and so we might have just
been
using the raw linear t in that case - so anything we do here is a
refinement of what we used to do and icing on the cake to some
extent)...


I'd say the dashing precision is better than what we used to have. It's
only slightly better, but even that is impressive when you consider that
we were doing up to 1024 subdivisions before, and now it's only 16, I think.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Denis Lila
Hi Jim.

 The main problem is that "must" doesn't exist for IEEE floating point
 numbers.  You can find the root for one of the endpoints and it may 
 return t = -.1 even though the value exactly matched the
 endpoint, but after all the math was said and done the answer
 it came up had the bit pattern for a tiny negative number, not
 0 (or 1.0001).  That t value would be rejected and then you'd
 have no roots.

That's true. That's what I meant when I said "finite precision math
doesn't necessarily care what calculus says" ;-)

 It's not hurting anything and we may find it useful in other contexts.
 We probably should have put it in a more generic package.

Sounds good. I suppose we can move it if the need ever arises.

 Shouldn't it be [A, B]?

I thought about this when implementing it, but I don't think it mattered
whether it was closed or half open, and the closed interval would have been
somewhat more awkward to implement.

 getMaxAcc functions - don't we want the furthest value from 0,
 positive or negative?  You are looking for most positive value
 and negative accelerations are equally problematic, aren't they?
 If so then these functions need some work.

You're right about both, but there's a much more serious problem that I
didn't think of when writing them: the value I compute in the if
statement in Dasher:355 is not an upper bound on the acceleration of
the curve. The acceleration is:
C'(t).dot(C''(t))/len(C'(t)) which in terms of the parameter polynomials is
(x'(t)*x''(t) + y'(t)*y''(t))/sqrt(x'(t)^2 + y'(t)^2)
What those functions would compute if they were correct would be 
max(abs(x''(t))) and max(abs(y''(t))), and the sum of these is not
closely related to the maximum absolute acceleration, which is what we
want.
Without the upper bound property, I don't think it's a very meaningful
test, and I think we should abandon this optimization. Do you agree?
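
Just to make the formula concrete, here is a throwaway sketch (xc and
yc are assumed to hold the power-basis coefficients of x(t) and y(t),
index = power of t):

// d|C'(t)|/dt = (x'(t)*x''(t) + y'(t)*y''(t)) / sqrt(x'(t)^2 + y'(t)^2)
static double accel(double[] xc, double[] yc, double t) {
    double xp  = xc[1] + t * (2 * xc[2] + t * 3 * xc[3]); // x'(t)
    double yp  = yc[1] + t * (2 * yc[2] + t * 3 * yc[3]); // y'(t)
    double xpp = 2 * xc[2] + 6 * xc[3] * t;               // x''(t)
    double ypp = 2 * yc[2] + 6 * yc[3] * t;               // y''(t)
    return (xp * xpp + yp * ypp) / Math.sqrt(xp * xp + yp * yp);
}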

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Jim Graham

On 12/8/2010 9:37 AM, Denis Lila wrote:

Shouldn't it be [A, B]?


I thought about this when implementing it, but I don't think it mattered
whether it was closed or half open, and the closed interval would have been
somewhat more awkward to implement.


I'm not sure how the closed interval is awkward.  Isn't it just proper 
choice of <= and >= vs. < and > in the testing method?



getMaxAcc functions - don't we want the furthest value from 0,
positive or negative?  You are looking for most positive value
and negative accelerations are equally problematic, aren't they?
If so then these functions need some work.


You're right about both, but there's a much more serious problem that I
didn't think of when writing them: the value I compute in the if
statement in Dasher:355 is not an upper bound on the acceleration of
the curve. The acceleration is:
C'(t).dot(C''(t))/len(C'(t)) which in terms of the parameter polynomials is
(x'(t)*x''(t) + y'(t)*y''(t))/sqrt(x'(t)^2 + y'(t)^2)
What those functions would compute if they were correct would be
max(abs(x''(t))) and max(abs(y''(t))), and the sum of these is not
closely related to the maximum absolute acceleration, which is what we
want.
Without the upper bound property, I don't think it's a very meaningful
test, and I think we should abandon this optimization. Do you agree?


How about if the 3 segments of the control polygon are all close to 
each other in length and angle, then the optimization applies.  Is that 
easy to test?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Denis Lila
 I'm not sure how the closed interval is awkward.  Isn't it just proper
 choice of <= and >= vs. < and > in the testing method?

In the filtering function, yes, but I was referring to cubicRootsInAB in
Helpers:122-133 where we iterate through intervals. For each interval,
we have the values of the function at the ends, and if the left one
is 0, we just add it as a zero and skip the call to CubicNewton. In order
to get roots in [A,B], we would either have to test both endpoints (which
would be more expensive and it would force us to find a way of avoiding
duplicate roots), or we would have to deal with the last interval as
a special case. The latter is not that bad, but it is more awkward than
what we have now.

 How about if the 3 segments of the control polygon are all close to 
 each other in length and angle, then the optimization applies.  Is
 that easy to test?

Hmm, that would actually be extremely easy to test and it would cost almost
nothing. We already compute the control polygon lengths in onLeaf, and
we're already assuming that every leaf is flat enough, so we probably
don't even need to check the angles. Comparing lengths should be good enough.
I'll try this out.

Thank you,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Jim Graham

Hi Denis,

On 12/8/2010 12:04 PM, Denis Lila wrote:

I'm not sure how the closed interval is awkward.  Isn't it just proper
choice of <= and >= vs. < and > in the testing method?


In the filtering function, yes, but I was referring to cubicRootsInAB in
Helpers:122-133 where we iterate through intervals. For each interval,
we have the values of the function at the ends, and if the left one
is 0, we just add it as a zero and skip the call to CubicNewton. In order
to get roots in [A,B], we would either have to test both endpoints (which
would be more expensive and it would force us to find a way of avoiding
duplicate roots), or we would have to deal with the last interval as
a special case. The latter is not that bad, but it is more awkward than
what we have now.


Perhaps it would be better if RootsInAB would advertise that it is 
returning approximations of a high precision, but they won't be exact 
and roots near the endpoints may not be caught and so the caller should 
be prepared to evaluate the endpoints manually to see if they represent 
interesting values for the purposes of why the roots were requested.


And then do that in the functions that call it.


How about if the 3 segments of the control polygon are all close to
each other in length and angle, then the optimization applies.  Is
that easy to test?


Hmm, that would actually be extremely easy to test and it would cost almost
nothing. We already compute the control polygon lengths in onLeaf, and
we're already assuming that every leaf is flat enough, so we probably
don't even need to check the angles. Comparing lengths should be good enough.
I'll try this out.


Actually, even if the lengths aren't close the lengths may give you 
enough information about the acceleration along the curve that you can 
do a decent approximation of the accelerated T value.  The T could be 
biased by some formula that is weighted by the ratios of the control 
polygon lengths.  As a very crude example, say you assumed that if the 
scaled leaf length fell into the first polygon segment's length then t 
should be proportionally a value from 0 to 1/3, and if it fell between 
the first poly len and the second then it would be proportionally a 
value from 1/3 to 2/3, etc.  The code might look like this:


polylen = l01 + l12 + l23
linelen = l03
// If l01==l12==l23 then most of the following becomes
// a NOP and t=leaflen/linelen
polyleaflen = leaflen * polylen / linelen;
if polyleaflen < l01 then t = (polyleaflen/l01)/3
else if polyleaflen < l01+l12 then t = ((pll-l01)/l12 + 1)/3
else t = (((pll-l01-l12)/l23)+2)/3

An even better approximation would involve some more math, but that 
might be better than simply using the linear interpolation along the 
segment connecting their endpoints.
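
Hedged into compilable Java, the crude scheme above might look like this
(biasedT is an illustrative name; only the l01/l12/l23/l03 quantities come
from the discussion):

    // Bias t by which control polygon segment the scaled leaf length
    // falls into, as described above. leaflen is the distance to
    // advance; l01, l12, l23 are the control polygon segment lengths
    // and l03 is the chord length.
    static float biasedT(float leaflen, float l01, float l12, float l23,
                         float l03) {
        float polylen = l01 + l12 + l23;
        float pll = leaflen * polylen / l03;   // polyleaflen
        if (pll < l01) {
            return (pll / l01) / 3f;                     // t in [0, 1/3)
        } else if (pll < l01 + l12) {
            return ((pll - l01) / l12 + 1f) / 3f;        // t in [1/3, 2/3)
        } else {
            return ((pll - l01 - l12) / l23 + 2f) / 3f;  // t in [2/3, 1]
        }
    }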


(Also, note that in the original code we probably would have just been 
dashing along the flattened curve anyway and so we might have just been 
using the raw linear t in that case - so anything we do here is a 
refinement of what we used to do and icing on the cake to some extent)...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-07 Thread Denis Lila
Hi Jim.

 I'm sure you will likely find a root, but the method you are using is
 roots*inAB* which may throw the root out because it is out of range,
 no?

I'm sure some roots will be thrown out, but I think in every call of
getTCloseTo there will be at least one root that isn't thrown out. That's
because our [A,B) is always [0, 1), and the point x,y that we pass
to getTCloseTo lies on the line joining the endpoints of the curve on
which getTCloseTo is called, so there must be some root in [0, 1). In fact,
there will be two roots, one for each parametric polynomial.

 Read the IEEE spec on NaN.  It's a special value that has this bizarre
 property that it is the only number that is not equal to itself. ;-) 
 In fact, the test for NaN is usually if (x == x) notNaN else NaN. 
 If you want to be explicit and formal then you can use the static 
 Float.isNaN() method (which is essentially that test - x!=x).

Interesting. I did not know that. I fixed these.

 A lot of the lines before you test MaxAcc are not needed unless you go
 into the if.  In particular, x,y,[xy][01] are only used if you call 
 getTCloseTo().

Fixed.

 Actually I think you've updated the AFD code so I should really take a
 look... :-(
 
 ;-)

Oh, of course! I completely forgot about that. I think I only changed
the quadratic AFD code to take into account the constant acceleration of
quadratic curves.
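
For context, a minimal sketch of what forward differencing of a quadratic
looks like (illustrative code, not the Renderer's actual AFD loop): since a
quadratic's second difference is constant, only the first difference needs
updating inside the loop.

    // Sample q(t) = a*t^2 + b*t + c at t = 0, h, 2h, ... using forward
    // differences. ddq never changes, which is the "constant
    // acceleration" being exploited.
    static void sampleQuad(float a, float b, float c, float h, int n) {
        float q   = c;                     // q(0)
        float dq  = a * h * h + b * h;     // q(h) - q(0)
        float ddq = 2 * a * h * h;         // constant second difference
        for (int i = 0; i < n; i++) {
            System.out.println(q);
            q  += dq;
            dq += ddq;
        }
    }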

 The problem is that normalization needs proper sub-pixel positioning
 so you need to hand it the true device space coordinates with proper 
 translation.  You need this:

Right, so I was wrong about normalization and translation being commutative.

 Using these two methods I don't think you need any transforms other
 than the original one - so all you need is strokerat which replaces both
 outat and inat and is either null (no special transform needed) or the
 original AT when special transforms are needed...

I implemented your way. I couldn't get rid of outat, however. In that case
where we have to apply a non-orthogonal transform with no normalization we
want to just apply the transform at the end of stroking and feed stroker
untransformed paths. So, now I have both outat and strokerat. At least
one of them must always be null. In the case I mention above, strokerat will
be null, and the calls to *deltaTransformConsumer(pc2d, strokerat) won't
do anything, but the transformConsumer(pc2d, outat) call will take care of
the post stroker transformation.

I don't think the TranslateFilter will ever be used, because transformConsumer
is now called only with outat, which cannot be a translation. So we may want
to remove it.

I also fixed the filterOutNotInAB function. It was violating cubicRootsInAB's
spec by filtering out everything not in (A, B), as opposed to everything not
in [A, B), which is what it should do.
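
A sketch of the corrected half-open filtering (signature and names are
illustrative; the real Helpers method may differ):

    // Keep only the roots that lie in [a, b); return the new count.
    // 'roots' holds 'num' candidate roots at the front of the array.
    static int filterOutNotInAB(float[] roots, int num, float a, float b) {
        int kept = 0;
        for (int i = 0; i < num; i++) {
            if (roots[i] >= a && roots[i] < b) {
                roots[kept++] = roots[i];
            }
        }
        return kept;
    }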

I uploaded the new webrev.

Thank you,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 On 12/6/2010 4:21 PM, Denis Lila wrote:
  Hi Jim.
 
  line 134 - what if numx or numy == 0 because only roots outside
 [0,1]
  were found?
 
  In this case lines 151-162 will execute, and nothing is wrong. The
 only
  problem is when both numx and numy are 0. This is certainly possible
 in
  the general case (but only with quadratic curves), but the way we're
 using
  this function, the intermediate value theorem guarantees at least
 one root
  will be found. Of course, finite precision math doesn't necessarily
 care
  what calculus has to say, but in this case I can't see how the root
 computation
  could fail. On the other hand, one could argue that this is exactly
 why we need
  to defend against this case, so I've added some checks.
 
 
  line 145 - what if d0 and d1 are both 0?  NaN results.  What if
 you
 just used a simple d0 > d1 ? tx : ty - how far off would that
 be?
 
  I tried that out on a curve with very high acceleration, and it
 makes a noticeable
  difference. So, instead I'm using
  if (w0 == Float.NaN) {
  return tx;
  }
 
 Same thing on Dasher line 363 where you also test for NaN.
 
  line 357 - another optimization would be to check the acceleration
 in
  the range and if it is below a threshold then simply use the t
 from
  line 348 as the t for the split
 
  I like this. I tried implementing it. I haven't tested it yet
 though, and
  I still have to pick a good cutoff acceleration. For now I'm using
 1/leaflen.
  We definitely don't want it to just be a constant, since the longer
 the leaf,
  the worse it will be to allow acceleration, so for longer leaves we
 want to
  skip the getTCloseTo call only if the acceleration is smaller.
 
  Renderer.java:  Is this just a straight copy of what I was working
 on?
 
  I'm pretty sure it is, yes.
 
 
  TransformingPathConsumer2D:
 
  Any thoughts on whether we need translation in the scale filter
 and
  whether a 4-element non-translation filter would be useful?  I
 think
  the current code that drives this passes in the 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-06 Thread Jim Graham

Hi Denis,

On 12/6/2010 4:21 PM, Denis Lila wrote:

Hi Jim.


line 134 - what if numx or numy == 0 because only roots outside [0,1]
were found?


In this case lines 151-162 will execute, and nothing is wrong. The only
problem is when both numx and numy are 0. This is certainly possible in
the general case (but only with quadratic curves), but the way we're using
this function, the intermediate value theorem guarantees at least one root
will be found. Of course, finite precision math doesn't necessarily care
what calculus has to say, but in this case I can't see how the root computation
could fail. On the other hand, one could argue that this is exactly why we need
to defend against this case, so I've added some checks.


I'm sure you will likely find a root, but the method you are using is 
roots*inAB* which may throw the root out because it is out of range, no?


In looking at that method it looks like the cubic handling code tries 0 
and 1 in addition to the critical points it finds using a root, but the 
quadratic code that it chooses if a==0 will throw out all roots outside 
the 0,1 range and may end up with 0 answers.  The cubic code further can 
reject all of the points (if they are all non-zero and same sign) and 
also return no answers, but may have fewer cases where it would do that.


Still, my point was not that you might fail to find a root, but that the 
roots may get rejected and end up with no answers in range.



line 145 - what if d0 and d1 are both 0?  NaN results.  What if you
just used a simple d0 > d1 ? tx : ty - how far off would that be?


I tried that out on a curve with very high acceleration, and it makes a 
noticeable
difference. So, instead I'm using
if (w0 == Float.NaN) {
return tx;
}


Read the IEEE spec on NaN.  It's a special value that has this bizarre 
property that it is the only number that is not equal to itself. ;-)  In 
fact, the test for NaN is usually if (x == x) notNaN else NaN.  If 
you want to be explicit and formal then you can use the static 
Float.isNaN() method (which is essentially that test - x!=x).
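
A tiny standalone illustration of that property (plain Java, nothing
Pisces-specific):

    public class NaNDemo {
        public static void main(String[] args) {
            float w0 = 0f / 0f;                   // produces NaN
            System.out.println(w0 == Float.NaN);  // false: NaN equals nothing
            System.out.println(w0 != w0);         // true: the usual NaN test
            System.out.println(Float.isNaN(w0));  // true: the explicit form
        }
    }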


Same thing on Dasher line 363 where you also test for NaN.


line 357 - another optimization would be to check the acceleration in
the range and if it is below a threshold then simply use the t from
line 348 as the t for the split


I like this. I tried implementing it. I haven't tested it yet though, and
I still have to pick a good cutoff acceleration. For now I'm using 1/leaflen.
We definitely don't want it to just be a constant, since the longer the leaf,
the worse it will be to allow acceleration, so for longer leaves we want to
skip the getTCloseTo call only if the acceleration is smaller.


A lot of the lines before you test MaxAcc are not needed unless you go 
into the if.  In particular, x,y,[xy][01] are only used if you call 
getTCloseTo().



Renderer.java:  Is this just a straight copy of what I was working on?


I'm pretty sure it is, yes.


Actually I think you've updated the AFD code so I should really take a 
look... :-(


;-)


TransformingPathConsumer2D:

Any thoughts on whether we need translation in the scale filter and
whether a 4-element non-translation filter would be useful?  I think
the current code that drives this passes in the full transform and its
inverse, but it could just as easily do delta transforms instead and
save the adding of the translation components...


I thought about this long ago, but I wasn't sure it wouldn't break anything.
Today, I got a bit more formal with the math, and I think it's ok
to eliminate the translation. I've implemented this (though, I haven't had
time to test it yet. That's for tomorrow).


Right now you have (note that the common terminology for transform 
without translation is delta transform):


PathIterator
=> DeltaAT => Normalize
=> DeltaInverseAT => strokers
=> FullAT => renderer

The problem is that normalization needs proper sub-pixel positioning so 
you need to hand it the true device space coordinates with proper 
translation.  You need this:


PathIterator
=> FullAT => Normalize
=> DeltaInverseAT => strokers
=> DeltaAT => renderer

I would skip the creation of atNotTranslationPart and just inverse the 
original transform (since I don't think the inversion is affected by 
translation - you can see this in the calculations in 
AT.createInverse()), and then have the transforming consumers apply a 
delta transform - i.e. create a TPC2D.deltaTransformConsumer() method 
which would apply just the non-translation parts of AT to the consumer.
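
A sketch of what such a delta-transform filter might do to each incoming
coordinate pair (illustrative only, not the actual TransformingPathConsumer2D
code; the real consumer interface has moveTo/lineTo/curveTo methods that
would each funnel through something like this):

    // Apply only the non-translation (delta) part of an AffineTransform.
    final class DeltaTransformFilter {
        private final double mxx, mxy, myx, myy; // translation dropped
        DeltaTransformFilter(java.awt.geom.AffineTransform at) {
            mxx = at.getScaleX(); mxy = at.getShearX();
            myx = at.getShearY(); myy = at.getScaleY();
        }
        // Transform one point in place; tx/ty are deliberately omitted.
        void deltaTransform(float[] xy) {
            float x = xy[0], y = xy[1];
            xy[0] = (float) (mxx * x + mxy * y);
            xy[1] = (float) (myx * x + myy * y);
        }
    }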


If you want to get really fancy with optimizations you could have an 
inverseDeltaTransformConsumer() method that would calculate the 
inversions on the fly to avoid construction of a scratch AT.  Since it 
is just weird transpose with signs and divide by the determinant in 
the most general case and even simpler (invert Mxx 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-18 Thread Denis Lila
Hi Jim.

I have a new webrev: 
http://icedtea.classpath.org/~dlila/webrevs/perfWebrev/webrev/

 How about looking more at the stroking end of the process and I'll dig
 a little more into optimal rasterization code.  I have a lot of
 experience with optimizing rasterizer code (and JNI if it comes to that), but
 very little with the curve manipulations involved in stroking (math is so 
 *hard* at my age ;-)...

I tried to do this. I used the NetBeans profiler, and it said that a large
chunk of the time (about 12%) is spent in curveTo. curveTo does almost
nothing: it just packs up the curve array and delegates to somethingTo. This
makes me think that there's not a whole lot that can be done to improve
Stroker's performance (I'm ok with this, since J2DBench and Java2DDemo
animation frame rates both say that non antialiased and non dashed
performance is better than even closed source java). I did make one small
change though: I inlined the calls to dxat, dyat, ddxat, ddyat in ROCsq
because the profiler said that a lot of time was spent there. This made a
surprisingly large difference (but still not that great overall).

I also fixed the dashing performance degradation. I removed the binary sort,
and am now using getCubicRoots to find the t where to split. Another hugely
significant change was using Math.sqrt instead of Math.hypot in the
implementation of Helpers.linelen. I had been meaning to do this for a
while, since sqrt is about 100 times faster on my machine than hypot, but I
didn't realize it would have such a large impact on dashing. Anyway, dashing
is now much, much faster than before. It is even faster than the flattening
version we used to use. The precision might not be as good as the current,
slow implementation, but it's only noticeable for curves with a lot of
acceleration, and even then only if you do a side by side comparison of the
2 implementations. The benchmarks display a 200%-500% improvement, so I
think it is well worth it. Unfortunately curve dashing is still a bit slower
than the closed source counterpart, but not by much.
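
For reference, the kind of change described is roughly (hedged sketch; the
real Helpers.linelen signature may differ):

    // Distance between two points via Math.sqrt. Math.hypot guards
    // against overflow/underflow in dx*dx + dy*dy, but for coordinates
    // in typical rendering ranges the plain form is safe and far cheaper.
    static float linelen(float x1, float y1, float x2, float y2) {
        float dx = x2 - x1, dy = y2 - y1;
        return (float) Math.sqrt(dx * dx + dy * dy);
    }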

I also tweaked the AFD algorithm for quadratic curves. It's a bit faster now.

A while ago you made a suggestion on how to improve anti aliasing performance:

 Here are my thoughts:
 
   - Currently Renderer has more stages than we probably should have:
for (each full pixel row) {
for (each subpixel row) {
for (each curve on subpixel row) {
convert curve into crossing
crossing is (subpixelx:31 + dir:1)
}
sort crossings for subpixel row
apply wind rule and convert crossings into alpha deltas
}
convert alpha deltas into run lengths
}
for (each tile) {
convert run lengths into alphas
}
   I'm thinking we should be able to get rid of a couple of those stages...
 
   - One alternate process would be:
(all work is done in the tail end of the cache itself)
for (each full pixel row) {
for (each curve on full pixel row) {
convert curve into N subpixel crossings
(subpixelx:28 + subpixelfracty:3 + dir:1)
}
}
sort all crossings for entire pixel row
maybe collapse raw crossings into modified accumulated crossings
record start of row in index array
for (each tile) {
convert raw or collapsed crossings directly into alphas
}
   Note that we could simply leave the raw crossings in the cache and then
   process them in the tile generator, but that would require more storage
   in the cache since a given path would tend to have 8 different entries
   as it crossed every subpixel row.  If we collapse them, then we can sum
   the entries for a given x coordinate into a single entry and store:
(pixelx:25 + coveragedelta:7)
   where the coverage delta is a signed quantity up to N_SUBPIX^2 so it
   uses the same storage we needed for the subpixel parts of both X and Y
   plus the direction bit.  It likely needs a paired value in the next x
   pixel location just like our current alpha deltas needs as well.  If
   we wanted to optimize that then we could devote one more bit to the
   next pixel will consume all of the remainder of the N^2 coverage, but
   there would be cases where that would not be true (such as if the pixel
   row has only partial vertical coverage by the path).  It's probably
   simpler to just have deltas for every pixel.
 
   The storage here would likely be similar to the storage used for the
   current cache since the current RLE cache uses 2 full ints to store a
   coverage and a count.  And in cases where we have one pixel taking up
  
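
A sketch of the crossing encoding proposed above, with illustrative helper
names (the 28/3/1 bit split comes from the quoted layout):

    // Pack a crossing as (subpixelx:28 | subpixelfracty:3 | dir:1).
    // Sorting the packed ints orders crossings by x, then by subpixel y.
    static int packCrossing(int subpixelx, int fracty, int dir) {
        return (subpixelx << 4) | ((fracty & 0x7) << 1) | (dir & 1);
    }
    static int crossingX(int cross)     { return cross >> 4; }
    static int crossingFracY(int cross) { return (cross >> 1) & 0x7; }
    static int crossingDir(int cross)   { return cross & 1; }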

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-10 Thread Denis Lila
Hi Jim.

 Did you have to modify the AFD code for this (in terms of changing
 their limit constants to get good results)?

No, I didn't. By handling non monotonic curves, the AFD algorithm
is going through more iterations, but the only way in which this
could be a problem is through accumulation of numerical inaccuracies,
and I don't think we do enough iterations for this to start causing
perceptible problems. I haven't noticed anything in all my testing.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 On 11/9/2010 3:06 PM, Denis Lila wrote:
  I see. In that case, I think it's a good idea if we don't make
 curves
  monotonic. I already did this, by moving the edgeMin/axX/Y
 handling
  and orientation computations in addLine. This did make it slower
 compared
  to the file you sent me, but only by very, very little. Curves were
  affected the most, and they were only 1% slower. I think we can
 handle
  this, especially since lines were about 1% faster. The code is also
 70
  lines shorter.
 
  The edgeM* members are used only so we don't have to iterate through
 every
  scanline if this is not necessary, and so that we can tell
 PiscesCache
  that the bounding box is smaller than what Renderer is given.
 However, now
  that we keep the bucket list, I think it would be more efficient if
 we
  got rid of EdgeM[in|ax]Y and simply computed the y bounds by looking
 at the
  head and tail of the bucket list.
 
 That makes sense.  We calculate that per-edge anyway so the edgeMy 
 constants are redundant.
 
  Also, perhaps we can keep track of edgeM[in|ax]X using the bounding
 boxes
  of curves, instead of the lines in the flattened curves. This would
 not
  be accurate, but I don't think it would affect rendering. It would
 simply
  result in a few more alpha boxes than necessary. I don't think these
 would
  be too bad, because mostly they're just going to be all 0 so they
 will
  be skipped because getTypicalAlpha() is now implemented.
  How do you think we should handle these 4 variables?
 
 I think this is probably OK, but let me play with it a bit and see
 what 
 I come up with.  For one thing, the extra slop may not be large
 enough 
 to trigger a full tile of 0's - there would have to be 32-pixel
 borders 
 for that to happen.
 
 If we get rid of the redundant edgeMy calculations then we might be
 able 
 to do edgeMx calculations on each edge without losing any
 performance...
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-10 Thread Denis Lila
Hi Jim.

 - get rid of edgeMxy in all methods but addLine()
 - addLine computes min/max of first/lastScanline
 - addLine also computes min/max of x1,x2 values
 
 this turned out to be just about the same speed for my FX rendering 
 version (which I believe is more sensitive than the way it is
 integrated 
 into JDK, so it should be even less noticeable in JDK).  It also paved
 the way for a couple of other optimizations that ended up netting
 about 1FPS for my current test case that I use so I'm happy for now.
 The code is a lot simpler now...

I also implemented what you describe and those are exactly my results too.
I implemented my ideas for optimizing edgeM[in|ax]Y too, but it turned out
not to make any difference whatsoever.

I should note that my benchmarks say the performance on horizontal lines has
decreased by 32% compared to the version where we qsorted everything. The
benchmark report says the overall performance has stayed the same because
every test other than horizontal lines is performing better by about 2-6%.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 I ended up going with:
 
   ...jim
 
 On 11/9/2010 3:26 PM, Denis Lila wrote:
  Hi again.
 
  I just thought of this: if we're really concerned about the
 accuracy
  of the edgeMinX edgeMaxX variables, we could find the curves'
  critical points and use them to compute the min/max X values. After
 all,
  we're creating (or rather setting) the Curve objects anyway. This
 isn't
  as fast as using the bounding boxes, but it's close and much more
 accurate.
 
  Regards,
  Denis.
 
  - Denis Liladl...@redhat.com  wrote:
 
  Hi Jim.
 
  All lines generated from a given allegedly monotonic curve are
  recorded with the same or (orientation) value.  But, if the
 curves
  are not truly monotonic then it might be theoretically possible
 to
  generate a line that is backwards with respect to the expected
  orientation.  It
  would then get recorded in the edges array with the wrong
  orientation
  and slope and then rasterization might unravel.
 
  I see. In that case, I think it's a good idea if we don't make
 curves
  monotonic. I already did this, by moving the edgeMin/axX/Y
 handling
  and orientation computations in addLine. This did make it slower
  compared
  to the file you sent me, but only by very, very little. Curves
 were
  affected the most, and they were only 1% slower. I think we can
 handle
  this, especially since lines were about 1% faster. The code is also
 70
  lines shorter.
 
  The edgeM* members are used only so we don't have to iterate
 through
  every
  scanline if this is not necessary, and so that we can tell
 PiscesCache
  that the bounding box is smaller than what Renderer is given.
 However,
  now
  that we keep the bucket list, I think it would be more efficient if
 we
  got rid of EdgeM[in|ax]Y and simply computed the y bounds by
 looking
  at the
  head and tail of the bucket list.
  Also, perhaps we can keep track of edgeM[in|ax]X using the
 bounding
  boxes
  of curves, instead of the lines in the flattened curves. This
 would
  not
  be accurate, but I don't think it would affect rendering. It would
  simply
  result in a few more alpha boxes than necessary. I don't think
 these
  would
  be too bad, because mostly they're just going to be all 0 so they
 will
  be skipped because getTypicalAlpha() is now implemented.
  How do you think we should handle these 4 variables?
 
  Thank you,
  Denis.
 
  - Jim Grahamjames.gra...@oracle.com  wrote:
 
  Hi Denis,
 
  On 11/8/2010 2:39 PM, Denis Lila wrote:
  Finally, I discovered (while testing for other problems) that
 the
  curves are not truly monotonic after slicing them.  I realized
  this
  years ago
  when I was writing my Area code (see sun.awt.geom.Curve) and
 put
  in
  tweaking code to make them monotonic after they were split. 
 They
  are
  never off by more than a few bits, but you can't trust the
 curve
  splitting math to generate purely monotonic segments based on a
 t
  generated by some unrelated math.  Sometimes the truly
 horizontal
  or
  vertical t value requires more precision than a float (or even
 a
  double) can provide and splitting at the highest representable
  float less than
  the t value produces a pair of curves on one side of the hill
 and
  splitting at the next float value (which is greater than the
 true
  t
  value) produces curves on the other side of the hill.  Also,
 when
  the
  curve has been split a few times already, the t values lose
  accuracy
  with each split.  This will all be moot if I can eliminate the
  splitting code from the renderer, but it may also play a factor
  in
  the
  stroke/dash
  code...
 
  Making curves monotonic is only used for optimization purposes,
  so I can't see how it would affect rendering correctness.
 
  Fortunately, the non-monotonicity is limited to a few bits of
  precision
  so this may never generate an errant edge in practice unless
  

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-09 Thread Jim Graham

I ended up going with:

- get rid of edgeMxy in all methods but addLine()
- addLine computes min/max of first/lastScanline
- addLine also computes min/max of x1,x2 values

this turned out to be just about the same speed for my FX rendering 
version (which I believe is more sensitive than the way it is integrated 
into JDK, so it should be even less noticeable in JDK).  It also paved 
the way for a couple of other optimizations that ended up netting about 
1FPS for my current test case that I use so I'm happy for now.


The code is a lot simpler now...

...jim

On 11/9/2010 3:26 PM, Denis Lila wrote:

Hi again.

I just thought of this: if we're really concerned about the accuracy
of the edgeMinX edgeMaxX variables, we could find the curves'
critical points and use them to compute the min/max X values. After all,
we're creating (or rather setting) the Curve objects anyway. This isn't
as fast as using the bounding boxes, but it's close and much more accurate.

Regards,
Denis.

- Denis Liladl...@redhat.com  wrote:


Hi Jim.


All lines generated from a given allegedly monotonic curve are
recorded with the same or (orientation) value.  But, if the curves
are not truly monotonic then it might be theoretically possible to
generate a line that is backwards with respect to the expected

orientation.  It

would then get recorded in the edges array with the wrong

orientation

and slope and then rasterization might unravel.


I see. In that case, I think it's a good idea if we don't make curves
monotonic. I already did this, by moving the edgeMin/axX/Y handling
and orientation computations in addLine. This did make it slower
compared
to the file you sent me, but only by very, very little. Curves were
affected the most, and they were only 1% slower. I think we can handle
this, especially since lines were about 1% faster. The code is also 70
lines shorter.

The edgeM* members are used only so we don't have to iterate through
every
scanline if this is not necessary, and so that we can tell PiscesCache
that the bounding box is smaller than what Renderer is given. However,
now
that we keep the bucket list, I think it would be more efficient if we
got rid of EdgeM[in|ax]Y and simply computed the y bounds by looking
at the
head and tail of the bucket list.
Also, perhaps we can keep track of edgeM[in|ax]X using the bounding
boxes
of curves, instead of the lines in the flattened curves. This would
not
be accurate, but I don't think it would affect rendering. It would
simply
result in a few more alpha boxes than necessary. I don't think these
would
be too bad, because mostly they're just going to be all 0 so they will
be skipped because getTypicalAlpha() is now implemented.
How do you think we should handle these 4 variables?

Thank you,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


Hi Denis,

On 11/8/2010 2:39 PM, Denis Lila wrote:

Finally, I discovered (while testing for other problems) that the
curves are not truly monotonic after slicing them.  I realized

this

years ago

when I was writing my Area code (see sun.awt.geom.Curve) and put

in

tweaking code to make them monotonic after they were split.  They

are

never off by more than a few bits, but you can't trust the curve
splitting math to generate purely monotonic segments based on a t
generated by some unrelated math.  Sometimes the truly horizontal

or

vertical t value requires more precision than a float (or even a
double) can provide and splitting at the highest representable

float less than

the t value produces a pair of curves on one side of the hill and
splitting at the next float value (which is greater than the true

t

value) produces curves on the other side of the hill.  Also, when

the

curve has been split a few times already, the t values lose

accuracy

with each split.  This will all be moot if I can eliminate the
splitting code from the renderer, but it may also play a factor

in

the

stroke/dash
code...


Making curves monotonic is only used for optimization purposes,
so I can't see how it would affect rendering correctness.


Fortunately, the non-monotonicity is limited to a few bits of
precision
so this may never generate an errant edge in practice unless
flattening
gets really fine-grained...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham

On 11/8/2010 6:34 AM, Denis Lila wrote:

Hi Clemens.


I've only followed your discussion with Jim but skipped all the
in-depth discussion.
 From my prior experience, usually JNI is not worth the trouble; if
you don't have a serious reason why using native code would have
advantages (like the possibility of using SIMD or when value-types
would be a huge benefit), it has its own performance pitfalls
especially if the workload is small and things like Get*ArrayCritical
cause scalability problems because they have to lock the GC.


Well, Jim Graham said that a native version of the engine still runs
a lot faster than the version with all my changes. That's why I thought


Actually, that report is old.  I've now got the new Java version turning 
in double the frame rates of the old native version.



it would be a good idea. Also, when not doing antialiasing we usually
feed paths to native consumers, so I thought if pisces used JNI, we
could reduce the java-C transitions five fold. But then I realized that
with antialiasing the opposite would happen, so I'm not sure whether
JNI is a good idea.


That's a good point that the other rasterizers will end up using this 
stroking engine and they are native.  We can worry about cleaning that 
up later.  JNI might eventually be a good idea, but lets fix the 
algorithm first and then worry about whether it will help this renderer 
or if we can make the interface to the other renderers simpler.


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham
It's still a work in progress, but I've cleaned up a lot of logic and 
made it faster in a number of ways.  Note that I've abstracted out the 
cache stuff and created an AlphaConsumer interface which may evolve 
over time.


In FX we actually consume alphas in larger chunks than the code in JDK 
which was driven by Ductus's 32x32 mandate, so I would have had to make 
completely incompatible changes to emitRow - so I moved it behind an 
interface.  For the JDK code, if you want to integrate this version, I 
would have the cache implement the new interface and move your version 
of emitRow into the Cache object.  I'd send you the new code for my 
AlphaConsumer, but it is incompatible with what you need to cache so it 
won't help you.


You'll also need a bit of un-translation cleanup as we have mirrors of 
all of java.awt.geom with special new APIs that FX needs.


...jim

On 11/8/2010 6:40 AM, Denis Lila wrote:

Hi Jim.


Also, I've gotten another 20% improvement out of the design with a few
more tweaks.  (Though I measured the 20% in the stripped down version
that I'm prototyping with FX so I'm not sure how much of that 20%
would show up through the layers of the 2D code.  Overall, I've about
doubled the frame rates of the prototype since your first drop that you
checked in to the OpenJDK repository.)


Can I see your new version?


Attached.


How about looking more at the stroking end of the process and I'll dig
a little more into optimal rasterization code.  I have a lot of
experience with optimizing rasterizer code (and JNI if it comes to that), but
very little with the curve manipulations involved in stroking (math is so
*hard* at my age ;-)...


Sounds good. Have you implemented your idea of processing one pixel row at a
time, as opposed to processing subpixel rows? If not, I could do that.


Not yet.  Right now I've gotten a lot of mileage out of a few tweaks of 
the bookkeeping of the sample-row-at-a-time version.  I'm still mulling 
over exactly how to make that go faster.


...jim
package com.sun.openpisces;

/**
 * @author Flar
 */
public interface AlphaConsumer {
public int getOriginX();
public int getOriginY();
public int getWidth();
public int getHeight();
public void setMaxAlpha(int maxalpha);
public void setAndClearRelativeAlphas(int alphaDeltas[], int y,
  int firstdelta, int lastdelta);
}
package com.sun.openpisces;

import com.sun.javafx.geom.PathConsumer;
import com.sun.javafx.geom.PathIterator;
import com.sun.javafx.geom.Rectangle;
import java.util.Iterator;

/**
 * @author Flar
 */
public final class OpenPiscesRenderer implements PathConsumer {

public static void feedConsumer(PathIterator pi, PathConsumer pc) {
float[] coords = new float[6];
while (!pi.isDone()) {
int type = pi.currentSegment(coords);
switch (type) {
case PathIterator.SEG_MOVETO:
pc.moveTo(coords[0], coords[1]);
break;
case PathIterator.SEG_LINETO:
pc.lineTo(coords[0], coords[1]);
break;
case PathIterator.SEG_QUADTO:
pc.quadTo(coords[0], coords[1],
  coords[2], coords[3]);
break;
case PathIterator.SEG_CUBICTO:
pc.curveTo(coords[0], coords[1],
   coords[2], coords[3],
   coords[4], coords[5]);
break;
case PathIterator.SEG_CLOSE:
pc.closePath();
break;
}
pi.next();
}
pc.pathDone();
}

private final class ScanlineIterator {

private int[] crossings;
private int[] edgePtrs;
private int edgeCount;

// crossing bounds. The bounds are not necessarily tight (the scan line
// at minY, for example, might have no crossings). The x bounds will
// be accumulated as crossings are computed.
private final int maxY;
private int nextY;

private static final int INIT_CROSSINGS_SIZE = 10;

private ScanlineIterator() {
crossings = new int[INIT_CROSSINGS_SIZE];
edgePtrs = new int[INIT_CROSSINGS_SIZE];

// We don't care if we clip some of the line off with ceil, since
// no scan line crossings will be eliminated (in fact, the ceil is
// the y of the first scan line crossing).
nextY = getFirstScanLineCrossing();
maxY = getScanLineCrossingEnd()-1;
}

private int next() {
// TODO: make function that convert from y value to bucket idx?
int cury = nextY++;
int bucket = cury - boundsMinY;
int count = this.edgeCount;
int ptrs[] = this.edgePtrs;
 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham

A couple of questions about the code that I haven't touched...

Is there some reason why the AFD for cubics doesn't have any tests for 
dddxy (the constants for its equations), but the AFD for quads is 
testing the ddxy on every loop?  I know that these values do change when 
the AFD variables are doubled or halved, but why does the cubic version 
get away with only testing out to the n-1th order differences but the 
quad version has to test out to the nth order differences?


Also, what is the value of breaking the pieces into monotonic segments 
prior to flattening?  Is it simply amortizing the cost of determining if 
the segment is up or down?  I guess this used to be done because we 
needed monotonic (in Y) curve segments since we did a top-down iteration 
across all segments, but now they aren't in the rasterization loop.  If 
it is simply a performance issue then I may experiment with eliminating 
that stage and seeing if I can make it go faster overall.


Finally, I discovered (while testing for other problems) that the curves 
are not truly monotonic after slicing them.  I realized this years ago 
when I was writing my Area code (see sun.awt.geom.Curve) and put in 
tweaking code to make them monotonic after they were split.  They are 
never off by more than a few bits, but you can't trust the curve 
splitting math to generate purely monotonic segments based on a t 
generated by some unrelated math.  Sometimes the truly horizontal or 
vertical t value requires more precision than a float (or even a double) 
can provide and splitting at the highest representable float less than 
the t value produces a pair of curves on one side of the hill and 
splitting at the next float value (which is greater than the true t 
value) produces curves on the other side of the hill.  Also, when the 
curve has been split a few times already, the t values lose accuracy 
with each split.  This will all be moot if I can eliminate the splitting 
code from the renderer, but it may also play a factor in the stroke/dash 
code...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-02 Thread Denis Lila
Hi Jim.

I implemented a middle ground between what I sent yesterday and
what we have now: using the array of linked lists only to replace
the quicksort (I think this was your original suggestion).

Unfortunately, this didn't work out (at least according to the
benchmarks). Curves were no different than what we used to have,
while the performance on lines (especially horizontal ones) decreased
significantly.

It's not obvious to me why this happened, so I think now I will put
this type of optimization aside and convert to JNI, where profiling
will be easier (for me - I haven't been able to make OProfile work
for java yet).

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 A generic suggestion - make all of your classes final - that gives the
 
 compiler the maximum flexibility to inline any methods you write.
 
 With respect to the algorithm choices:
 
 I think the key is that the X sorting rarely has any work to do.  The
 
 first test of does this edge need to be swapped with the next lower 
 edge is probably 99.999% guaranteed to be false.  Thus, trying to 
 optimize anything else to simplify swapping is likely a step in the 
 wrong direction.
 
 The casting may be hurting a bit more, and it is hurting on every
 access 
 to an edge.
 
 I'm guessing the best model to use would be to write the code to
 assume 
 no swapping is necessary at all.  Get that code as simple and as fast
 as 
 can be.  Then add as little perturbation of that code to manage
 swapping 
 when it is necessary, and that will likely be the optimal 
 implementation.  I think you could probably even do some benchmarking
 on 
 a path that is carefully tested to process lots of edges without any X
 
 sorting and get that as fast as you can without any swap code, and
 then 
 add the swap code that impacts the performance of that operation as 
 little as possible.  The key is that the swap code have as little
 impact 
 on the performance of the case that never needs any swapping at all 
 first and foremost - then make swapping faster within that
 constraint...
 
   ...jim
 
 On 11/1/10 3:13 PM, Denis Lila wrote:
  Hi Jim.
 
  I implemented your bucket sort idea. I'm not just using the buckets
  to remove the y-sort. I use them in the iteration through the
 scanlines
  too. What happens is that on any iteration, the active list is the
  doubly linked list buckets[nextY-boundsMinY]. I did this because I
 thought
  less memory would need to be moved around compared to when we just
 kept
  the active list pointers in an array. For example, with doubly
 linked
  lists we can implement insertion sort with O(1) writes. With arrays
 we
  have to use O(n) writes. This also allows us to get rid of the the
 edgePtrs
  array.
 
  I ran some benchmarks, and unfortunately I was wrong, somewhere. All
 the tests
  are at least 10% slower than the insertion sort version of what we
 have now.
  I can't immediately see why this is. The only thing I can think of
 is that
  there are a lot of float -> int casts, but then again, I don't know
 how
  slow this operation is. It would be good if it's because of the
 casts because
  it would no longer be an issue when we convert to native.
 
  I also made an unrelated change: I now find the orientation and
 update x0,y0
  much earlier than we used to. The way I was doing it before was
 silly.
 
  Regards,
  Denis.
 
  - Jim Grahamjames.gra...@oracle.com  wrote:
 
  Hi Denis,
 
  Good news!
 
  On 10/28/2010 3:27 PM, Denis Lila wrote:
  If we moved to a Curve class or some other way to
  consolidate the 3 lists (may be easier in native code), this
 might
  win
  in more cases...
 
  Does that mean you no longer think we should flatten every curve
 as
  soon
  as we get it?
 
  No, I was just discussing the feasibility of that one suggestion
 in
  the
  context of all of the possibilities of whether or not we took the
  other
  choices.  If you think that flattening will pay off then we don't
 have
 
  to worry about the 3 lists.  It was just that when I hacked it in,
 I
  still had your 3 lists and so the pros and cons of horizontal edge
  sorting were relative to that version of the renderer...
 
 ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-02 Thread Jim Graham

Hi Denis,

I had a bit of luck with the following next() method:

private int next() {
// TODO: make function that convert from y value to bucket idx?
int bucket = nextY - boundsMinY;
for (int ecur = edgeBuckets[bucket]; ecur != NULL; ecur = 
(int)edges[ecur+NEXT]) {

edgePtrs = LilaHelpers.widenArray(edgePtrs, edgeCount, 1);
edgePtrs[edgeCount++] = ecur;
// REMIND: Adjust start Y if necessary
}
int crossingCount = edgeCount;
crossings = LilaHelpers.widenArray(crossings, 0, 
crossingCount);

nextY++;
for (int i = 0; i < edgeCount; i++) {
int ecur = edgePtrs[i];
float curx = edges[ecur+CURX];
int cross = ((int) curx) << 1;
edges[ecur+CURX] = curx + edges[ecur+SLOPE];
if (edges[ecur+OR] > 0) {
cross |= 1;
}
int j = i;
while (--j >= 0) {
int jcross = crossings[j];
if (jcross <= cross) {
break;
}
crossings[j+1] = jcross;
edgePtrs[j+1] = edgePtrs[j];
}
crossings[j+1] = cross;
edgePtrs[j+1] = ecur;
}
int newCount = 0;
for (int i = 0; i < edgeCount; i++) {
int ecur = edgePtrs[i];
if (edges[ecur+YMAX] > nextY) {
edgePtrs[newCount++] = ecur;
}
}
edgeCount = newCount;
return crossingCount;
}

This allowed me to:

- delete a lot of the bucket helper functions.
- get rid of the back pointers
- pare an edge down to 5 values (YMAX, CURX, OR, SLOPE, and NEXT)

I also used the following addLine() method:

private void addLine(float x1, float y1, float x2, float y2, int or) {
final int firstCrossing = (int)Math.max(Math.ceil(y1), boundsMinY);
if (firstCrossing >= boundsMaxY) {
return;
}
final int ptr = numEdges * SIZEOF_EDGE;
final float slope = (x2 - x1) / (y2 - y1);
edges = LilaHelpers.widenArray(edges, ptr, SIZEOF_EDGE);
numEdges++;
edges[ptr+OR] = or;
edges[ptr+CURX] = x1 + (firstCrossing - y1) * slope;
edges[ptr+SLOPE] = slope;
edges[ptr+YMAX] = y2;
final int bucketIdx = firstCrossing - boundsMinY;
addEdgeToBucket(ptr, bucketIdx);
}

How does that fare for you?

...jim

On 11/2/2010 4:10 PM, Denis Lila wrote:

Hi Jim.

I implemented a middle ground between what I sent yesterday and
what we have now: using the array of linked lists only to replace
the quicksort (I think this was your original suggestion).

Unfortunately, this didn't work out (at least according to the
benchmarks). Curves were no different than what we used to have,
while the performance on lines (especially horizontal ones) decreased
significantly.

It's not obvious to me why this happened, so I think now I will put
this type of optimization aside and convert to JNI, where profiling
will be easier (for me - I haven't been able to make OProfile work
for java yet).

Regards,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


Hi Denis,

A generic suggestion - make all of your classes final - that gives the

compiler the maximum flexibility to inline any methods you write.

With respect to the algorithm choices:

I think the key is that the X sorting rarely has any work to do.  The

first test of does this edge need to be swapped with the next lower
edge is probably 99.999% guaranteed to be false.  Thus, trying to
optimize anything else to simplify swapping is likely a step in the
wrong direction.

The casting may be hurting a bit more, and it is hurting on every
access
to an edge.

I'm guessing the best model to use would be to write the code to
assume
no swapping is necessary at all.  Get that code as simple and as fast
as
can be.  Then add as little perturbation of that code to manage
swapping
when it is necessary, and that will likely be the optimal
implementation.  I think you could probably even do some benchmarking
on
a path that is carefully tested to process lots of edges without any X

sorting and get that as fast as you can without any swap code, and
then
add the swap code that impacts the performance of that operation as
little as possible.  The key is that the swap code have as little
impact
on the performance of the case that never needs any swapping at all
first and foremost - then make swapping faster within that
constraint...

...jim

On 11/1/10 3:13 PM, Denis Lila wrote:

Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the

scanlines

too. What happens is that on any 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-01 Thread Jim Graham

Hi Denis,

A generic suggestion - make all of your classes final - that gives the 
compiler the maximum flexibility to inline any methods you write.


With respect to the algorithm choices:

I think the key is that the X sorting rarely has any work to do.  The 
first test of does this edge need to be swapped with the next lower 
edge is probably 99.999% guaranteed to be false.  Thus, trying to 
optimize anything else to simplify swapping is likely a step in the 
wrong direction.


The casting may be hurting a bit more, and it is hurting on every access 
to an edge.


I'm guessing the best model to use would be to write the code to assume 
no swapping is necessary at all.  Get that code as simple and as fast as 
can be.  Then add as little perturbation of that code to manage swapping 
when it is necessary, and that will likely be the optimal 
implementation.  I think you could probably even do some benchmarking on 
a path that is carefully tested to process lots of edges without any X 
sorting and get that as fast as you can without any swap code, and then 
add the swap code that impacts the performance of that operation as 
little as possible.  The key is that the swap code have as little impact 
on the performance of the case that never needs any swapping at all 
first and foremost - then make swapping faster within that constraint...


...jim

On 11/1/10 3:13 PM, Denis Lila wrote:

Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the scanlines
too. What happens is that on any iteration, the active list is the
doubly linked list buckets[nextY-boundsMinY]. I did this because I thought
less memory would need to be moved around compared to when we just kept
the active list pointers in an array. For example, with doubly linked
lists we can implement insertion sort with O(1) writes. With arrays we
have to use O(n) writes. This also allows us to get rid of the the edgePtrs
array.

I ran some benchmarks, and unfortunately I was wrong, somewhere. All the tests
are at least 10% slower than the insertion sort version of what we have now.
I can't immediately see why this is. The only thing I can think of is that
there are a lot of float -> int casts, but then again, I don't know how
slow this operation is. It would be good if it's because of the casts because
it would no longer be an issue when we convert to native.

I also made an unrelated change: I now find the orientation and update x0,y0
much earlier than we used to. The way I was doing it before was silly.

Regards,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


Hi Denis,

Good news!

On 10/28/2010 3:27 PM, Denis Lila wrote:

If we moved to a Curve class or some other way to
consolidate the 3 lists (may be easier in native code), this might

win

in more cases...


Does that mean you no longer think we should flatten every curve as

soon

as we get it?


No, I was just discussing the feasibility of that one suggestion in
the
context of all of the possibilities of whether or not we took the
other
choices.  If you think that flattening will pay off then we don't have

to worry about the 3 lists.  It was just that when I hacked it in, I
still had your 3 lists and so the pros and cons of horizontal edge
sorting were relative to that version of the renderer...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-28 Thread Jim Graham

Hi Denis,

Good news!

On 10/28/2010 3:27 PM, Denis Lila wrote:

If we moved to a Curve class or some other way to
consolidate the 3 lists (may be easier in native code), this might win
in more cases...


Does that mean you no longer think we should flatten every curve as soon
as we get it?


No, I was just discussing the feasibility of that one suggestion in the 
context of all of the possibilities of whether or not we took the other 
choices.  If you think that flattening will pay off then we don't have 
to worry about the 3 lists.  It was just that when I hacked it in, I 
still had your 3 lists and so the pros and cons of horizontal edge 
sorting were relative to that version of the renderer...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-27 Thread Jim Graham

Hi Denis,

On 10/26/2010 6:58 AM, Denis Lila wrote:

90% (guesstimate) of the time edges do not cross each other, thus if
you sort the crossings without reordering the active edges then you just
end up doing the same sorting work (same swaps) on the next scanline.  My
SpanShapeIterator code actually reordered the edges on every sample
line to match their current X coordinates in a way that involved 1 compare
per edge that was processed and only occasionally a swap of 2 edge
pointers.  It would basically eliminate the sort at line 149 at the
cost of only doing a compare in the nextY processing for the very very
common case.


I also implemented this some time ago. I didn't keep it because it slowed
things down a bit. However, I only tested it with very complex and large
paths, and in hindsight, I shouldn't have based my decision on that, so I
will re-implement it.


I tried using this new rasterizer in FX and I had a test case which had 
a few shapes that were essentially zig-zaggy shapes on the top and 
bottom and fills between the zigzags (kind of like a seismic chart with 
fills between the pen squiggles).


When I added a simple sort of the linear edges the performance nearly 
doubled in speed.  Here is the code I replaced your quad-next-edge loop 
with:


for (int i = elo; i < ehi; i++) {
int ptr = edgePtrs[i];
int cross = ((int) edges[ptr+CURX]) << 1;
if (edges[ptr+OR] > 0) {
cross |= 1;
}
edges[ptr+CURY] += 1;
edges[ptr+CURX] += edges[ptr+SLOPE];
int j = crossingIdx;
while (--j >= 0) {
int jcross = crossings[j];
if (cross >= jcross) {
break;
}
crossings[j+1] = jcross;
edgePtrs[elo+j+1] = edgePtrs[elo+j];
}
crossings[j+1] = cross;
edgePtrs[elo+j+1] = ptr;
crossingIdx++;
}

I then did a conditional sort, moved to right after the qlo-qhi and 
clo-chi loops:


for (int i = qlo; i < qhi; i++) {
// same stuff
}
for (int i = clo; i < chi; i++) {
// same stuff
}
if (qhi > qlo || chi > clo) {
Arrays.sort(crossings, 0, crossingIdx);
}

In the case of my test case where I only had a polygon to fill, the 
performance doubled.  Obviously I didn't do much for the case where 
there were curves because this was just a quick hack to see the value of 
sorting the edges.  If we moved to a Curve class or some other way to 
consolidate the 3 lists (may be easier in native code), this might win 
in more cases...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-26 Thread Denis Lila
Hi Jim.

 Just to be certain - you are still planning on putting the existing 
 stuff back and we're talking about future work, right?  I'd love to
 get a stake in the ground here.

Yes, I'll push today.

 If we are really worried about the y-sort, then how about creating a 
 bunch of buckets and doing a bucket sort of the edges?  As they are 
 added to the list of segments, we accumulate their indices in a row
 list based on their startY so that each step of the next() simply moves
 to the next Y and adds the edges mentioned in the list there.  Some work
 would have to be done on how to manage the storage simply (like a 
 rownext field in the edge structure so that they are linked in a 
 linked list), but then there would be no qsort at all for the cost of
 new int[N_ROWS] and an extra field in every edge.

I like this.
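
A minimal sketch of that bucket scheme (field names and the NEXT offset are
illustrative, though consistent with the addEdgeToBucket call in Jim's
2010-11-02 message quoted above):

    // Bucket edges by first scanline; edgeBuckets[i] heads a singly
    // linked list threaded through the NEXT slot of each edge record.
    static final int NEXT = 4;    // hypothetical offset of the link field
    static final int NULL = -1;
    static float[] edges;         // flattened edge records
    static int[] edgeBuckets;     // one list head per scanline row

    static void addEdgeToBucket(int ptr, int bucketIdx) {
        edges[ptr + NEXT] = edgeBuckets[bucketIdx]; // link to old head
        edgeBuckets[bucketIdx] = ptr;               // ptr becomes new head
    }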

 Perhaps we should work on the algorithms a little more then (I'm
 talking about the numeric stuff, not the memory management stuff since the 
 memory management techniques will differ quite a lot in C code, but 
 better math helps at either level)?

Indeed - especially the dashing.

 90% (guesstimate) of the time edges do not cross each other, thus if
 you sort the crossings without reordering the active edges then you just
 end up doing the same sorting work (same swaps) on the next scanline.  My
 SpanShapeIterator code actually reordered the edges on every sample
 line to match their current X coordinates in a way that involved 1 compare
 per edge that was processed and only occasionally a swap of 2 edge 
 pointers.  It would basically eliminate the sort at line 149 at the
 cost of only doing a compare in the nextY processing for the very very
 common case.

I also implemented this some time ago. I didn't keep it because it slowed
things down a bit. However, I only tested it with very complex and large
paths, and in hindsight, I shouldn't have based my decision on that, so I
will re-implement it.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 On 10/25/2010 3:30 PM, Denis Lila wrote:
  - Create a curve class and store an array of those so you don't
 have
  to iterate 3 different arrays of values and use array accesses to
 grab
  the data (each array access is checked for index OOB exceptions).
 
  I actually implemented something like this in my first draft of the
 current
  version of renderer. I didn't stick with it because it was a slower
 than
  even what we have in openjdk6. Granted, my implementation was a bit
 more
  high level than simply making 3 classes to represent lines quads and
 cubics,
  but it seemed pretty hopeless, so I didn't spend any time figuring
 out
  exactly what it was that made it slower.
 
 Hmmm...  Perhaps object allocation overhead was biting us there.  In 
 native code you could cobble this up with batch allocation of space
 and 
 pseudo-object/struct allocation out of the batches.
 
  - Or perform AFD on curves as they come into Renderer, but only
 save
  line segment edges in the edges array.  This would use more
 storage,
  but the iterations of the AFD would happen in a tight loop as the
 data
  comes in rather than having to store all of their AFD data back in
 the quads
 
  I considered this before abandoning the high level version I mention
 above.
  I didn't go with it because, while I am not worried about the higher
 memory
  consumption, I was afraid of the impact that having this many edges
 would
  have on the qsort call and on lines 99-113 and 140-148 in next().
 
 I can see worrying about qsort, but I think one qsort would be 
 inherently faster than 3 qsorts which you have anyway so the
 difference 
 would get lost in the noise.  Also, I'm not sure how the 99 to 113 and
 
 140 to 148 would be affected.  The path will have the same number of 
 crossings per sample row regardless of whether the curves have been 
 flattened or not.  You might be adding and deleting edges from the 
 active list more often, though (in other words, 99 would dump more 
 curves and 140 would take in more curves), but the number of edges or
 
 curves at any given point would not be affected by flattening.  Also,
 
 the way you've written the loops at 99, they have to copy every 
 edge/quad/curve that *doesn't* expire so a skipped curve is actually 
 less work for that loop.  The loops at 140 would have to occasionally
 do 
 more processing, but it is made up for in the fact that 99 does less 
 work and the nextY processing is simpler.
 
  How about this: we change the format of the edges array to be an
 array of
  sequences of edges. So, right now the edges array has this format:
  E* where E represents 6 consecutive floats
 {ymin,ymax,curx,cury,or,slope}.
  I am proposing we change it to
 {n,or}({ymin,ymax,curx,cury,slope})^n.
  lineTo's would add an edge sequence with n=1 to the edges array. If
 a call
  to quadTo or curveTo produced N curves, it would simply put N,or at
 the
  end of the edges array, and then append the data for 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Denis Lila
 That's great. I will be pushing today.

About that: you wrote the TransformingPathConsumer2D file,
so how should you be given credit? Should I put your name in
Contributed-by? Should I put an @author tag in the file?
Or does the reviewed-by suffice?

Regards,
Denis.

- Denis Lila dl...@redhat.com wrote:

 Hi Jim.
 
  How about this:
  
  (Math.abs(len-leftLen) < err*len)
  
  (noting that err*len can be calculated outside of the loop).
 
 This is what I use now.
 
  Note that a custom shape can send segments in any order that it
 wants
  so close,close can happen from a custom shape even if Path2D won't
 do
  it.
 
 Right, of course. For some reason I only considered the test case I
 was
 using. Sorry for slip up.
 
  There is only one question on the board from my perspective - the 
  question about dash length errors at the start of this message. 
 
 I'm ok with the accuracy of dash length computations. My main
 concerns
 right now are dashing performance and AA rendering performance. The
 latter has improved, but not by as much as I was expecting. Also, it
 was
 bothering me that when I removed PiscesCache 1-2 weeks ago
 performance
 got worse.
 It would also be nice to find a method that is guaranteed to find
 all offset curve cusps.
 
  Depending on how you feel about that I think we're ready to go
 
 That's great. I will be pushing today.
 
  (and I have some ideas on further optimizations to consider if you
 are still
  game after this goes in)...
 
 I'd love to hear what they are.
 
 Thank you for all your time,
 Denis.
 
 - Jim Graham james.gra...@oracle.com wrote:
 
  On 10/22/2010 12:22 PM, Denis Lila wrote:
   Because the error is meant to be relative. What I use is supposed
 to
  be
   equivalent to difference/AverageOfCorrectValueAndApproximation < err,
   where, in our case,
  AverageOfCorrectValueAndApproximation=(len+leafLen)/2,
   so that multiplication by 2 should have been a division.
   Should I use the less fancy difference/CorrectValue < err and skip
   a division by 2?
  
  If it is relative, then shouldn't it be relative to the desired
  length? 
Why does the computed approximation factor into the size of the
  error 
  you are looking for?  If you use the average then the amount of
 error
  
  that you will accept will be larger if your estimate is wronger. 
 I
  
  don't think the wrongness of your approximation should have any
  effect 
  on your error.
  
   lines 98-101 - somewhere in the confusion moveToImpl became
  redundant
   with moveTo.
  
   I thought I'd leave these in because they're interface methods,
   and it's usually not a great idea to call these from private
  methods. I've
   removed them anyway, because you said so and because I suppose
   if ever we need to do something to the user input that shouldn't
 be
  done
   to input coming from private methods in the class (such as the
  scaling
   of the input coordinates done in Renderer) we can always easily
   reintroduce them.
  
  I actually thought about the interface concept after I sent that
 and
  was 
  at peace with them, but I'm also at peace with them being gone. ;-)
  
   That's right. I don't think it's what should happen, but it's
 what
  closed
   jre does, so I've left it in (with some changes to make it
 actually
   replicate the behaviour of closed java, since it was buggy - the
  moveTo
   was missing). I don't draw anything on the second close in the
  close,close
   case, since that would look horrible with round joins and square
  caps.
   However, the way path2D's are implemented this safeguard will not
  be
   activated since a closePath() following another closePath() is
  ignored.
  
   I also now initialize the *dxy *mxy variables in moveTo. This
 should
  be an
   inconsequential change, but I've put it in just so the state of
   Stroker after a moveTo is well defined.
  
  New code looks good (even if we think it's a silly side effect of
  closed 
  JDK to have to implement).
  
   line 368 - does this cause a problem if t==1 because lastSegLen
  will
   now return the wrong value?  Maybe move this into an else clause
 on
  the
   t=1 test?
  
   It does cause a problem. I've fixed it by adding a variable that
  keeps track
   of the length of the last returned segment. The way it was being
  done was
   a dirty hack because it assumed that if the method didn't return
 in
  the loop,
   done would remain false.
  
  Woohoo!  I was actually bothered by the side-effecting in the old
 code
  - 
  this is a better approach.
  
   I made one more change in dasher: in somethingTo I removed the
 long
  comment
   near the end, and I handle the case where phase == dash[idx]
  immediately.
   I do this for consistency with lineTo. The only instances where
 this
  makes
   a difference is when we have a path that starts with a dash, ends
 at
  exactly
   the end of a dash, and has a closePath. So something like
  move(100,0),
   

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Jim Graham

Hi Denis,

On 10/25/2010 7:34 AM, Denis Lila wrote:

(and I have some ideas on further optimizations to consider if you are still
game after this goes in)...


I'd love to hear what they are.


Here are my thoughts:

- Currently Renderer has more stages than we probably should have:
for (each full pixel row) {
    for (each subpixel row) {
        for (each curve on subpixel row) {
            convert curve into crossing
            crossing is (subpixelx:31 + dir:1)
        }
        sort crossings for subpixel row
        apply wind rule and convert crossings into alpha deltas
    }
    convert alpha deltas into run lengths
}
for (each tile) {
    convert run lengths into alphas
}
I'm thinking we should be able to get rid of a couple of those stages...

- One alternate process would be:
(all work is done in the tail end of the cache itself)
for (each full pixel row) {
    for (each curve on full pixel row) {
        convert curve into N subpixel crossings
        (subpixelx:28 + subpixelfracty:3 + dir:1)
    }
}
sort all crossings for entire pixel row
maybe collapse raw crossings into modified accumulated crossings
record start of row in index array
for (each tile) {
    convert raw or collapsed crossings directly into alphas
}
Note that we could simply leave the raw crossings in the cache and then 
process them in the tile generator, but that would require more storage 
in the cache since a given path would tend to have 8 different entries 
as it crossed every subpixel row.  If we collapse them, then we can sum 
the entries for a given x coordinate into a single entry and store:

(pixelx:25 + coveragedelta:7)
where the coverage delta is a signed quantity up to N_SUBPIX^2 so it 
uses the same storage we needed for the subpixel parts of both X and Y 
plus the direction bit.  It likely needs a paired value in the next x 
pixel location just like our current alpha deltas need as well.  If 
we wanted to optimize that then we could devote one more bit to "the 
next pixel will consume all of the remainder of the N^2 coverage", but 
there would be cases where that would not be true (such as if the pixel 
row has only partial vertical coverage by the path).  It's probably 
simpler to just have deltas for every pixel.


The storage here would likely be similar to the storage used for the 
current cache since the current RLE cache uses 2 full ints to store a 
coverage and a count.  And in cases where we have one pixel taking up 
partial coverage and the following pixel taking up the remainder of the 
full coverage then we have 4 ints, but the crossing delta system would 
only have 2 ints.
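
To make those two packings concrete, here is a minimal Java sketch (my own
illustration under the bit widths named above, not code from any webrev; all
names are hypothetical):

    final class CrossingPacking {
        // Raw crossing: (subpixelx:28 + subpixelfracty:3 + dir:1),
        // assuming 3 subpixel bits per axis (an 8x8 grid).
        static int packRaw(int subpixX, int fracY, boolean up) {
            return (subpixX << 4) | (fracY << 1) | (up ? 1 : 0);
        }
        static int subpixX(int c) { return c >>> 4; }
        static int fracY(int c)   { return (c >>> 1) & 0x7; }
        static int dir(int c)     { return ((c & 1) == 0) ? -1 : +1; }

        // Collapsed entry: (pixelx:25 + coveragedelta:7); the delta is
        // signed, so in this sketch it must lie in [-64, 63].
        static int packCollapsed(int pixelX, int delta) {
            return (pixelX << 7) | (delta & 0x7f);
        }
        static int pixelX(int c) { return c >>> 7; }
        static int delta(int c)  { return (c << 25) >> 25; } // sign-extend low 7 bits
    }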


Other thoughts...

- Create a curve class and store an array of those so you don't have to 
iterate 3 different arrays of values and use array accesses to grab the 
data (each array access is checked for index OOB exceptions).


- Or perform AFD on curves as they come into Renderer, but only save 
line segment edges in the edges array.  This would use more storage, but 
the iterations of the AFD would happen in a tight loop as the data comes 
in rather than having to store all of their AFD data back in the quads 
and curves arrays and then reload the data for every sub-pixel step. 
Renderer still takes curves, it just breaks them down immediately rather 
than on the fly.  If there are only a small number of edges per curve 
then the storage might not be that much worse because the quad and curve 
arrays already store more values than the edge array.


- Convert to native.  Note that we use a native version of the pisces 
that you started with to do some rendering in FX.  I tried porting to 
use your new (Java) renderer in FX and performance went down even though 
you show it to be faster than what was there before.  So, your Java 
renderer compares favorably to the old Java pisces, but both compare 
unfavorably to the old native pisces.  Maybe we should convert your code 
to native and see if that gives us a performance boost.  It's nice to 
use pure Java, but there is a lot of shoehorning of data going on here 
that could be done much more easily and naturally in native code.


How is that for food for thought?

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Denis Lila
Hi Jim.

 - Create a curve class and store an array of those so you don't have
 to iterate 3 different arrays of values and use array accesses to grab
 the data (each array access is checked for index OOB exceptions).

I actually implemented something like this in my first draft of the current
version of renderer. I didn't stick with it because it was slower than
even what we have in openjdk6. Granted, my implementation was a bit more
high level than simply making 3 classes to represent lines quads and cubics,
but it seemed pretty hopeless, so I didn't spend any time figuring out
exactly what it was that made it slower.

 - Or perform AFD on curves as they come into Renderer, but only save 
 line segment edges in the edges array.  This would use more storage,
 but the iterations of the AFD would happen in a tight loop as the data
 comes in rather than having to store all of their AFD data back in the quads

I considered this before abandoning the high level version I mention above.
I didn't go with it because, while I am not worried about the higher memory
consumption, I was afraid of the impact that having this many edges would
have on the qsort call and on lines 99-113 and 140-148 in next().
How about this: we change the format of the edges array to be an array of
sequences of edges. So, right now the edges array has this format:
E* where E represents 6 consecutive floats {ymin,ymax,curx,cury,or,slope}.
I am proposing we change it to {n,or}({ymin,ymax,curx,cury,slope})^n.
lineTo's would add an edge sequence with n=1 to the edges array. If a call
to quadTo or curveTo produced N curves, it would simply put N,or at the
end of the edges array, and then append the data for the N produced edges.
I think this would give us the best of both worlds, in that we can do all
the AFD iterations in a tight loop close to the input methods and it
doesn't present any problems with respect to sorting or managing the
active list. It can probably be implemented completely transparently with
respect to ScanlineIterator. The details of
the implementation involve an interesting speed/memory trade off, but we
can discuss that later.
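
To make the proposed layout concrete, a rough Java sketch (the class, its
field names and sizes are mine, not from any patch):

    final class EdgeList {
        // Flat storage: {n,or}({ymin,ymax,curx,cury,slope})^n, in order.
        private float[] edges = new float[1024];
        private int end = 0;

        // Called once per lineTo (n = 1), or once per quadTo/curveTo with
        // the n segments the AFD loop just produced (5 floats per segment).
        void addEdgeSequence(int n, float orientation, float[] segs) {
            int needed = 2 + 5 * n;
            if (end + needed > edges.length) {
                edges = java.util.Arrays.copyOf(edges,
                            Math.max(2 * edges.length, end + needed));
            }
            edges[end++] = n;            // sequence header: edge count
            edges[end++] = orientation;  // sequence header: shared orientation
            System.arraycopy(segs, 0, edges, end, 5 * n);
            end += 5 * n;
        }
    }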

 - Convert to native.  Note that we use a native version of the pisces
 that you started with to do some rendering in FX.  I tried porting to
 use your new (Java) renderer in FX and performance went down even
 though you show it to be faster than what was there before.  So, your Java 
 renderer compares favorably to the old Java pisces, but both compare 
 unfavorably to the old native pisces.  Maybe we should convert your
 code to native and see if that gives us a performance boost.  It's nice to
 use pure Java, but there is a lot of shoehorning of data going on
 here that could be done much more easily and naturally in native code.

I've been wanting to do this for a very long time. C and C++ are more
convenient for this type of work. I didn't because I've never used the JNI
before. I guess this is as good a time to learn as any. I'd still like to
keep the debugging of native code to a minimum, so we should implement
as much as possible in java before starting on this.


I still need to think some more about your other suggestion. I'll reply
to it tomorrow.

 How is that for food for thought?
Delicious :)

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 On 10/25/2010 7:34 AM, Denis Lila wrote:
  (and I have some ideas on further optimizations to consider if you
 are still
  game after this goes in)...
 
  I'd love to hear what they are.
 
 Here are my thoughts:
 
 - Currently Renderer has more stages than we probably should have:
  for (each full pixel row) {
  for (each subpixel row) {
  for (each curve on subpixel row) {
  convert curve into crossing
  crossing is (subpixelx:31 + dir:1)
  }
  sort crossings for subpixel row
  apply wind rule and convert crossings into alpha deltas
  }
  convert alpha deltas into run lengths
  }
  for (each tile) {
  convert run lengths into alphas
  }
 I'm thinking we should be able to get rid of a couple of those
 stages...
 
 - One alternate process would be:
  (all work is done in the tail end of the cache itself)
  for (each full pixel row) {
  for (each curve on full pixel row) {
  convert curve into N subpixel crossings
  (subpixelx:28 + subpixelfracty:3 + dir:1)
  }
  }
  sort all crossings for entire pixel row
  maybe collapse raw crossings into modified accumulated crossings
  record start of row in index array
  for (each tile) {
  convert raw or collapsed crossings directly into alphas
  }
 Note that we could simply leave the raw crossings in the cache and
 then 
 process them in the tile generator, but that would require more
 storage 
 in the cache since a given path would tend to have 8 different 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Jim Graham

Hi Denis,

Just to be certain - you are still planning on putting the existing 
stuff back and we're talking about future work, right?  I'd love to get 
a stake in the ground here.


On 10/25/2010 3:30 PM, Denis Lila wrote:

- Create a curve class and store an array of those so you don't have
to iterate 3 different arrays of values and use array accesses to grab
the data (each array access is checked for index OOB exceptions).


I actually implemented something like this in my first draft of the current
version of renderer. I didn't stick with it because it was slower than
even what we have in openjdk6. Granted, my implementation was a bit more
high level than simply making 3 classes to represent lines quads and cubics,
but it seemed pretty hopeless, so I didn't spend any time figuring out
exactly what it was that made it slower.


Hmmm...  Perhaps object allocation overhead was biting us there.  In 
native code you could cobble this up with batch allocation of space and 
pseudo-object/struct allocation out of the batches.



- Or perform AFD on curves as they come into Renderer, but only save
line segment edges in the edges array.  This would use more storage,
but the iterations of the AFD would happen in a tight loop as the data
comes in rather than having to store all of their AFD data back in the quads


I considered this before abandoning the high level version I mention above.
I didn't go with it because, while I am not worried about the higher memory
consumption, I was afraid of the impact that having this many edges would
have on the qsort call and on lines 99-113 and 140-148 in next().


I can see worrying about qsort, but I think one qsort would be 
inherently faster than 3 qsorts which you have anyway so the difference 
would get lost in the noise.  Also, I'm not sure how the 99 to 113 and 
140 to 148 would be affected.  The path will have the same number of 
crossings per sample row regardless of whether the curves have been 
flattened or not.  You might be adding and deleting edges from the 
active list more often, though (in other words, 99 would dump more 
curves and 140 would take in more curves), but the number of edges or 
curves at any given point would not be affected by flattening.  Also, 
the way you've written the loops at 99, they have to copy every 
edge/quad/curve that *doesn't* expire so a skipped curve is actually 
less work for that loop.  The loops at 140 would have to occasionally do 
more processing, but it is made up for in the fact that 99 does less 
work and the nextY processing is simpler.



How about this: we change the format of the edges array to be an array of
sequences of edges. So, right now the edges array has this format:
E* where E represents 6 consecutive floats {ymin,ymax,curx,cury,or,slope}.
I am proposing we change it to {n,or}({ymin,ymax,curx,cury,slope})^n.
lineTo's would add an edge sequence with n=1 to the edges array. If a call
to quadTo or curveTo produced N curves, it would simply put N,or at the
end of the edges array, and then append the data for the N produced edges.
I think this would give us the best of both worlds, in that we can do all
the AFD iterations in a tight loop close to the input methods and it
doesn't present any problems with respect to sorting or managing the
active list. It can probably be implemented completely transparently with
respect to ScanlineIterator. The details of
the implementation involve an interesting speed/memory trade off, but we
can discuss that later.


I think this might be overkill since sorts tend to have logN behavior so 
doubling the number of edges would not double the time taken in the 
sort.  Also, I would think that the sort would be a small amount of time 
compared to the rest of the processing, wasn't it?


If we are really worried about the y-sort, then how about creating a 
bunch of buckets and doing a bucket sort of the edges?  As they are 
added to the list of segments, we accumulate their indices in a row list 
based on their startY so that each step of the next() simply moves to 
the next Y and adds the edges mentioned in the list there.  Some work 
would have to be done on how to manage the storage simply (like a 
"rownext" field in the edge structure so that they are linked in a 
linked list), but then there would be no qsort at all for the cost of 
new int[N_ROWS] and an extra field in every edge.
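
A minimal sketch of that bucketing scheme (my own reconstruction; rowHead
and rowNext are hypothetical names for the index array and the per-edge
link field):

    final class EdgeBuckets {
        private final int[] rowHead;  // first edge starting on each row, -1 if none
        private final int[] rowNext;  // next edge starting on the same row

        EdgeBuckets(int numRows, int maxEdges) {
            rowHead = new int[numRows];
            rowNext = new int[maxEdges];
            java.util.Arrays.fill(rowHead, -1);
        }

        // O(1) insertion as each edge arrives; no qsort afterwards.
        void add(int edgeIndex, int startY) {
            rowNext[edgeIndex] = rowHead[startY];
            rowHead[startY] = edgeIndex;
        }

        // next() then activates edges on row y with:
        //     for (int e = head(y); e != -1; e = next(e)) { ... }
        int head(int y) { return rowHead[y]; }
        int next(int e) { return rowNext[e]; }
    }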



- Convert to native.  Note that we use a native version of the pisces
that you started with to do some rendering in FX.  I tried porting to
use your new (Java) renderer in FX and performance went down even
though you show it to be faster than what was there before.  So, your Java
renderer compares favorably to the old Java pisces, but both compare
unfavorably to the old native pisces.  Maybe we should convert your
code to native and see if that gives us a performance boost.  It's nice to
use pure Java, but there is a lot of shoehorning of data going on
here that could be done much more easily 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-22 Thread Denis Lila
Hi Jim.

 I was going to run these today, but fixing the dashing bug above and
 rerunning the tests took a while and it's already 8:30 pm here and I
 have to go home. I'll run them tomorrow morning.

I ran the benchmarks. I've attached the options file. I ran benchmarks
of my icedtea installation, closed source java, openjdk7 without the
webrev, and openjdk7 with the webrev.

The results files are at http://icedtea.classpath.org/~dlila/benchResults/
I think the names are pretty self explanatory.

The html comparisons are:
jdk6 vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/JDK6vsLatest_html/Summary_Report.html

closed source vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/SUNvsLatest_html/Summary_Report.html

and most importantly:
previous version of pisces in openjdk7 vs latest work:
http://icedtea.classpath.org/~dlila/benchResults/PrevVsLatest_html/Summary_Report.html

The improvements are significant. Running J2DAnalyzer on all the results files 
with
jdk6Bench.res as the basis produces the following summary:

Summary:
  jdk6Bench: 
Number of tests:  104
Overall average:  30.7986576862
Best spread:  0.0% variance
Worst spread: 10.96% variance
(Basis for results comparison)

  sunBench: 
Number of tests:  104
Overall average:  276654.4443479696
Best spread:  0.25% variance
Worst spread: 19.28% variance
Comparison to basis:
  Best result:  6488.34% of basis
  Worst result: 43.74% of basis
  Number of wins:   80
  Number of ties:   2
  Number of losses: 22

  prevPisces: 
Number of tests:  104
Overall average:  221539.3516605794
Best spread:  0.08% variance
Worst spread: 5.54% variance
Comparison to basis:
  Best result:  350.33% of basis
  Worst result: 55.0% of basis
  Number of wins:   57
  Number of ties:   10
  Number of losses: 37

  latestPisces: 
Number of tests:  104
Overall average:  226762.64157611743
Best spread:  0.0% variance
Worst spread: 3.03% variance
Comparison to basis:
  Best result:  501.86% of basis
  Worst result: 26.23% of basis
  Number of wins:   72
  Number of ties:   4
  Number of losses: 28


But unfortunately, if you look at the individual test cases in the html
reports there are also some stepbacks, most notably in the dashing of
anything that isn't a straight line. In fact, the results of drawOval
show a 50%-500% improvement in non-dashed drawing, and a 50%-25% deterioration
in dashed drawing. I was expecting this after introducing the binary search algorithm.
I've been thinking it might be better if we just go with the old algorithm
and simply use Dasher.LengthIterator to flatten the curves and feed the lines
to lineTo. Or we could just go with what we have and hope we find a better
algorithm.

Regards,
Denis.


piscesBench.opt
Description: Binary data


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Denis Lila
Hi Jim.

 In the meantime, can you outline the tests that you ran?

I ran Java2D without any problems. There's also been an
icedtea bug http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=450
related to a look and feel (tinylaf: 
http://www.muntjak.de/hans/java/tinylaf/tinylaf-1_4_0.zip
to run a demo, download, extract and run tinycp.jar) that wasn't being
painted properly. Now it looks good (I think there's no LCD text antialiasing, 
but
that's a different story).

I also wrote a test that rendered at least 5 random cubic curves. I didn't
inspect these too closely - this test was just to find any obvious
errors. There weren't any.

Most importantly, I also ran this test suite:
http://icedtea.classpath.org/hg/gfx-test/

It generates a lot of images using a reference java implementation and
an implementation to be tested. I used closed source java as a reference,
and I ran it using a fresh checkout of the 2d branch of openjdk7:
http://icedtea.classpath.org/~dlila/prevPiscesSungfx/results/

and a fresh checkout of the 2d branch with my patch applied:
http://icedtea.classpath.org/~dlila/latestPiscesSungfx/

As you can see, the number of images that are the same as the reference
implementation has increased substantially. It has also increased
consistently across all categories.
I've gone through most of the images, and the ones that are different
aren't dramatically different. The differences are barely noticeable.
The improvement is even greater when compared to openjdk6 which produces
some pretty scary test results, especially in the scaled rectangles
category (but I haven't uploaded these tests so you'll just have to trust
me on this ;-)  )

NOTE: the webrev has changed slightly since the last e-mail I sent. The 
gfx test suite revealed a bug in the drawing of dashed rectangles, so
to fix it I have changed a line in Dasher.closePath from if(needsMoveTo) {
to if(needsMoveTo || !dashOn) {

 Also, have you used J2DBench at all?  I know you ran some of your own
 benchmarks, but didn't know if you were familiar with this tool:
 
   {OpenJDK}/src/share/demo/java2d/J2DBench/

I was going to run these today, but fixing the dashing bug above and
rerunning the tests took a while and it's already 8:30 pm here and I have to
go home. I'll run them tomorrow morning.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 I'll be focusing on this later today, just a last proofread.
 

 
   ...jim
 
 On 10/21/2010 12:27 PM, Denis Lila wrote:
  Good to push?
 
  http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Jim Graham

Hi Denis,

I saw something in the latest webrev that reminded me of an earlier comment.

On 10/18/2010 2:21 PM, Denis Lila wrote:

line 389 - The test here is different from closePath.  What if they
were both prev == DRAWING_OP_TO?


I am now using prev!=DRAWING_OP_TO (not ==, since it is supposed to execute
finish() if no nonzero length lines have been fed to Stroker yet). In fact
I have removed the started variable since it's equivalent to 
prev==DRAWING_OP_TO.


It looks like closePath still uses a different test than moveTo and 
pathDone.  They all test for DRAWING_OP_TO, but closepath uses != 
whereas the others use ==.  Is that right?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham

Hi Denis,

One clarification:

On 10/20/10 7:11 AM, Denis Lila wrote:

When would the isCW test trigger?  Does it track rev?  What happens
at 180 degrees (is that test reliable for the randomization that might
happen when omxy are directly opposite mxy)?


isCw is used for computing the arc bisector by testing whether the computed
point is on the side it should be (and multiplying by -1 if not), it is used
to compute the sign of cv in drawBezApproxForArc, and for computing rev.


The only reason I ask is
because I think the sign of mmxy is probably controllable by
understanding the input conditions, but this test should be safe
(modulo if it really works at 180 degrees).  If it has failure modes at 180
degrees then reworking the math to produce the right sign in the first
place may be more robust for that case.  A test for this is to render
(0,0) -> (100,0) -> (0,0) with round caps and then rotate it through
360 degrees and see if the round caps invert at various angles.


I already did that. I drew 100 lines like the one you describe. I attached
the results. It never fails. It is still possible that there could be some
case where it fails, but this does prove that such a case would be very rare.


Also, line 256 - does that track rev?


It does. I changed the test to if(rev).


Cool, but above I was also asking the same question about line 231, and 
you provided a lot of information about line 231 (and a test to verify 
it), but didn't answer if the test in line 231 also tracks rev the same 
way...?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham

On 10/20/10 7:54 AM, Denis Lila wrote:

In #2, you have a bunch of I'() || B'() which I read as the slope
of the derivative (i.e. acceleration) is equal, don't you really mean
I() || B() which would mean the original curves should be parallel?
Otherwise you could say I'() == B'(), but I think you want to show
parallels because that shows how you can use the dxy1,dxy4 values as
the parallel equivalents.


Not really. I've updated the comment explaining what || does, and
it should be clearer now. Basically, A(t) || B(t) means that vectors
A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero c),
not that curves A and B are parallel at t.


I'm not sure we are on the same page here.

I'() is usually the symbol indicating the derivative of I().  My issue 
is not with the || operator, but with the fact that you are applying it 
to the I'() instead of I().


Also, how is A(t) and B(t) are parallel not the same as the curves A 
and B are parallel at t?


Also, A(t) = c*B(t) is always true for all A and B and all t if you take 
a sample in isolation.  Parallel means something like A(t) = c*B(t) 
with the same value of c for some interval around t, not that the 
values at t can be expressed as a multiple.


Again, I'() || B'() says to me that the derivative curves are parallel, 
not that the original curves are parallel...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Denis Lila
 Cool, but above I was also asking the same question about line 231,
 and you provided a lot of information about line 231 (and a test to verify
 it), but didn't answer if the test in line 231 also tracks rev the
 same way...?

Oh, no, line 231 isn't meant to be related to rev at all. It just checks
to see on which side of the (omx,omy),(mx,my) chord the computed (mmx, mmy)
is.
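
For reference, a which-side-of-the-chord test like this normally reduces to
the sign of a 2D cross product; a minimal sketch (mine, not the actual line
231):

    // Positive cross product: (mmx,mmy) is on the counter-clockwise side
    // of the directed chord (omx,omy) -> (mx,my) in y-up coordinates
    // (the sides swap in y-down device space).
    static boolean ccwOfChord(float omx, float omy,
                              float mx, float my,
                              float mmx, float mmy) {
        float cross = (mx - omx) * (mmy - omy) - (my - omy) * (mmx - omx);
        return cross > 0.0f;
    }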

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 One clarification:
 
 On 10/20/10 7:11 AM, Denis Lila wrote:
  When would the isCW test trigger?  Does it track rev?  What
 happens
  at 180 degrees (is that test reliable for the randomization that
 might
  happen when omxy are directly opposite mxy)?
 
  isCw is used for computing the arc bisector by testing whether the
 computed
  point is on the side it should be (and multiplying by -1 if not), it
 is used
  to compute the sign of cv in drawBezApproxForArc, and for computing
 rev.
 
  The only reason I ask is
  because I think the sign of mmxy is probably controllable by
  understanding the input conditions, but this test should be safe
  (modulo if it really works at 180 degrees).  If it has failure
 modes at 180
  degrees then reworking the math to produce the right sign in the
 first
  place may be more robust for that case.  A test for this is to
 render
  (0,0) -> (100,0) -> (0,0) with round caps and then rotate it
 through
  360 degrees and see if the round caps invert at various angles.
 
  I already did that. I drew 100 lines like the one you describe. I
 attached
  the results. It never fails. It is still possible that there could
 be some
  case where it fails, but this does prove that such a case would be
 very rare.
 
  Also, line 256 - does that track rev?
 
  It does. I changed the test to if(rev).
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Denis Lila
 Also, how is A(t) and B(t) are parallel not the same as the curves A
 and B are parallel at t?

Well, suppose A and B are lines with endpoints (0,0), (2,0) for A
and (0,1),(2,1) for B. Obviously, for all t, A and B are parallel at t.
However let t = 0.5. Then A(t) = (1,0) and B(t) = (1, 1). The vectors
(1,0) and (1,1) are not parallel, so saying A(t) || B(t) is the same
as saying that there exists c such that (1,0) = c*(1,1), which isn't true.

However, A'(t)=(2,0) and B'(t)=(2,0), and the vectors 
(2,0) and (2,0) are parallel.

Does this make more sense?

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 On 10/20/10 7:54 AM, Denis Lila wrote:
  In #2, you have a bunch of I'() || B'() which I read as the
 slope
  of the derivative (i.e. acceleration) is equal, don't you really
 mean
  I() || B() which would mean the original curves should be
 parallel?
  Otherwise you could say I'() == B'(), but I think you want to
 show
  parallels because that shows how you can use the dxy1,dxy4 values
 as
  the parallel equivalents.
 
  Not really. I've updated the comment explaining what || does, and
  it should be clearer now. Basically, A(t) || B(t) means that
 vectors
  A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero c),
  not that curves A and B are parallel at t.
 
 I'm not sure we are on the same page here.
 
 I'() is usually the symbol indicating the derivative of I().  My
 issue 
 is not with the || operator, but with the fact that you are applying
 it 
 to the I'() instead of I().
 
 Also, A(t) = c*B(t) is always true for all A and B and all t if you
 take 
 a sample in isolation.  Parallel means something like A(t) = c*B(t) 
 with the same value of c for some interval around t, not that the 
 values at t can be expressed as a multiple.
 
 Again, I'() || B'() says to me that the derivative curves are
 parallel, 
 not that the original curves are parallel...
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham
Right, but it seemed to me that if omxy was the "from" vector and mxy 
was the "to" vector, then the computed mmxy should always be predictably 
on the same side of it, no?  If it was on the wrong side then it 
wouldn't be a random occurence, it must be related to the input data. 
So either it is always on the right side, always on the wrong side (i.e. 
just reverse the rotation in the math), or always on the right/wrong 
side depending on the CWness of the join angle - which would be 
reflected in rev...  No?


...jim

On 10/20/10 10:29 AM, Denis Lila wrote:

Cool, but above I was also asking the same question about line 231,
and you provided a lot of information about line 231 (and a test to verify
it), but didn't answer if the test in line 231 also tracks rev the
same way...?


Oh, no, line 231 isn't meant to be related to rev at all. It just checks
to see on which side of the (omx,omy),(mx,my) chord the computed (mmx, mmy)
is.

Regards,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


Hi Denis,

One clarification:

On 10/20/10 7:11 AM, Denis Lila wrote:

When would the isCW test trigger?  Does it track rev?  What

happens

at 180 degrees (is that test reliable for the randomization that

might

happen when omxy are directly opposite mxy)?


isCw is used for computing the arc bisector by testing whether the

computed

point is on the side it should be (and multiplying by -1 if not), it

is used

to compute the sign of cv in drawBezApproxForArc, and for computing

rev.



The only reason I ask is
because I think the sign of mmxy is probably controllable by
understanding the input conditions, but this test should be safe
(modulo if it really works at 180 degrees).  If it has failure

modes at 180

degrees then reworking the math to produce the right sign in the

first

place may be more robust for that case.  A test for this is to

render

(0,0) -> (100,0) -> (0,0) with round caps and then rotate it

through

360 degrees and see if the round caps invert at various angles.


I already did that. I drew 100 lines like the one you describe. I

attached

the results. It never fails. It is still possible that there could

be some

case where it fails, but this does prove that such a case would be

very rare.



Also, line 256 - does that track rev?


It does. I changed the test to if(rev).


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham
OK, I can see how your terminology works now, but it seems odd to me.  I 
never consider re-expressing the coordinates on a curve as a vector and 
basing geometric properties on those constructed vectors.  I either 
consider the points on the curve, or its tangent or its normal - none of 
which is the value you are expressing.  You are, essentially, operating 
on tangent vectors, but you aren't calling them that, you are calling 
them something like "the vector of the derivative", which is a relative 
(direction only) version of the tangent vector (which has location and 
direction).  When one talks about curves and being parallel, my mind 
tends to think of the tangents of the curves being parallel and tangents 
are directed by the first derivative.


Also, if you are going to use your definition of vector then parallel 
is an odd term to use for values that originate from the same point 
(points considered as a vector are taken to originate from 0,0) - really 
you want those vectors to be collinear, not (just) parallel.


So, either || means "the coordinates of the curves expressed as vectors 
are collinear" or it means "the curves (i.e. the tangents of the curve 
at the indicated point) are parallel".  Saying "vector I() is parallel 
to vector B()" didn't really have meaning to me based on the above biases.


So, I get your comment now and all of the math makes sense, but the 
terminology seemed foreign to me...


...jim

On 10/20/10 10:48 AM, Denis Lila wrote:

Also, how is A(t) and B(t) are parallel not the same as the curves A
and B are parallel at t?


Well, suppose A and B are lines with endpoints (0,0), (2,0) for A
and (0,1),(2,1) for B. Obviously, for all t, A and B are parallel at t.
However let t = 0.5. Then A(t) = (1,0) and B(t) = (1, 1). The vectors
(1,0) and (1,1) are not parallel, so saying A(t) || B(t) is the same
as saying that there exists c such that (1,0) = c*(1,1), which isn't true.

However, A'(t)=(2,0) and B'(t)=(2,0), and the vectors
(2,0) and (2,0) are parallel.

Does this make more sense?

Regards,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


On 10/20/10 7:54 AM, Denis Lila wrote:

In #2, you have a bunch of I'() || B'() which I read as the

slope

of the derivative (i.e. acceleration) is equal, don't you really

mean

I() || B() which would mean the original curves should be

parallel?

Otherwise you could say I'() == B'(), but I think you want to

show

parallels because that shows how you can use the dxy1,dxy4 values

as

the parallel equivalents.


Not really. I've updated the comment explaining what || does, and
it should be clearer now. Basically, A(t) || B(t) means that

vectors

A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero c),

not that curves A and B are parallel at t.


I'm not sure we are on the same page here.

I'() is usually the symbol indicating the derivative of I().  My
issue
is not with the || operator, but with the fact that you are applying
it
to the I'() instead of I().

Also, A(t) = c*B(t) is always true for all A and B and all t if you
take
a sample in isolation.  Parallel means something like A(t) = c*B(t)
with the same value of c for some interval around t, not that the
values at t can be expressed as a multiple.

Again, I'() || B'() says to me that the derivative curves are
parallel,
not that the original curves are parallel...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-19 Thread Denis Lila
Hi Jim.

If I haven't replied to a suggestion, that means I've implemented and
I thought it was a good idea, so I don't have anything to say about it.

 line 238: If X0,Y0,XL,COUNT were bumped up by 1 then you could just 
 reuse SLOPE from the linear indices - just a thought.

True, but I would like to preserve the naming differences. CURSLOPE
makes it clear that the slope will change.

 lines 521,527,533: Why are these executed twice?  You call these
 methods again after the initialize common fields code.  That seems like
 double the work just to maybe save 4 lines of shared code?  Maybe put the 4 
 lines of shared code in a helper function that all of the init()
 methods call?

Wow, I can't believe these slipped past me. What happened was that I used to
initialize the type specific fields first, to avoid having 2 switch statements.
However, that didn't work out (for reasons explained in the comment), so I 
needed
2 switches after all. I guess I just forgot to delete the init* calls in the 
first
one. I'm pretty sure that the init* calls in the first switch can just be 
deleted.
In fact, it might be a bug to leave them there, since init* calls the AFD 
iteration
function. If we have 2 init calls, the AFD function will be called twice, so 
this
is probably a bug.

 line 37: If it can't be instantiated, why does it take arguments?

Another mistake when cutting, pasting, and modifying old code.

 getTransformedPoints isn't used?
 getUntransformedPoints isn't used?
 fillWithIndxes(float) isn't used?
 evalQuad isn't used?  (Though it does provide symmetry with evalCubic
 which is used)
 getFlatness* aren't used?
 ptSegDistSq isn't used?

Should I get rid of these? I wanted to do it, but I wanted to wait until
just before pushing because I was afraid I'd start needing them again at
some point in the future.

 line 105: There is a closed form solution to cubic roots.  I 
 unfortunately used a bad version in CubicCurve2D.solveCubic and I
 don't remember if I ever went back and fixed it (it may even have been 
 Cardano's method for all I know).  There are versions out there
 which do work better.  The problem with the one in CC2D was that I copied it
 out of Numerical Recipes in C and apparently the author somehow
 assumed that all cubics would have 1 or 3 roots, but a cubic of the form 
 (x-a)(x-a)(x-b) has 2 roots.  D'oh!  While I did find other 
 implementations out there on the net, in the end fixing the closed
 form solution seemed wrought with issues since many of the tests would use
 radically different approaches depending on tiny changes in one of the
 intermediate results and so I worried about FP error even in doubles 
 possibly skewing the results.  I think you should leave your code in 
 there, but I wanted to fill you in on other possibities.

I looked around for a closed form solution but I only found variations
of the one on wikipedia. I decided not to implement it because it looked
like I was going to have to deal with complex numbers and I didn't want
to do that. It also didn't seem like it would be a lot faster than Newton's
method which actually does very few iterations (4-6 if I remember correctly).
But I'll keep this in mind, and if I find a good implementation of a closed
form formula, I'll use it.

 line 566: shouldn't horizontal lines be ignored?  they don't affect 
 rasterization.

They don't, but it's important to always update the current position, otherwise,
if we get something like: moveto(0,0),lineTo(100,0), lineTo(100,100), instead
of recording a vertical line from 100,0 to 100,100 we would record a diagonal
line from 0,0 to 100,100. The actual ignoring is done in the six lines following
these two.
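
In sketch form (mine, with a hypothetical emitEdge recorder), the pattern is:

    class EdgeRecorderSketch {
        float curx, cury;

        void addLine(float x1, float y1, float x2, float y2) {
            if (y1 != y2) {
                emitEdge(x1, y1, x2, y2);  // horizontal edges add no crossings
            }
            curx = x2;  // the current point must advance regardless, or the
            cury = y2;  // next segment would start from a stale point
        }

        void emitEdge(float x1, float y1, float x2, float y2) {
            // hypothetical: record {ymin,ymax,curx,cury,or,slope} here
        }
    }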


The link is still
http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/
I haven't thoroughly tested it yet, but Java2DDemo looks good.

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Round 3...
 
 On 10/6/2010 1:36 PM, Denis Lila wrote:
 
  webrev:
  http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/
 
 I'm going to set the rest of Stroker.java aside for a moment and focus
 
 on other areas where I have some better knowledge...
 
 Renderer.java:
 
 lines 83, 91, 99: can't these be folded into the prior loops?  You can
 
 update their Y while searching for the [eqc]hi value.
 
 lines 178,192: indent to the preceding (?  (Also, I'm a big fan of 
 moving the { to a line by itself if an conditional or argument list
 
 was wrapped to more than 1 line - the 2D team tends to use that style
 
 everywhere in the 2D code...)
 
 line 193: add fieldForCmp here instead of every time in the loop? 
 (The 
 compiler will probably/hopefully do that anyway)
 
 line 574: indentation?
 
 line 612: delete?  Or will this be making a comeback sometime?
 
 lines 624,626: indentation?
 
 lines 724,725: doesn't the assert false omit these?  I usually throw
 an 
 InternalError in cases like this with a description of what went
 wrong.
 
 I've read up through the use of the cache in 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-19 Thread Jim Graham

On 10/19/2010 10:38 AM, Denis Lila wrote:

Hi Jim.

If I haven't replied to a suggestion, that means I've implemented and
I thought it was a good idea, so I don't have anything to say about it.


That's mostly true too for me, but there are a couple that I might go 
back to - I'll let you know when I think I've reached a 100% coverage 
(getting close).



getTransformedPoints isn't used?
getUntransformedPoints isn't used?
fillWithIndxes(float) isn't used?
evalQuad isn't used?  (Though it does provide symmetry with evalCubic
which is used)
getFlatness* aren't used?
ptSegDistSq isn't used?


Should I get rid of these? I wanted to do it, but I wanted to wait until
just before pushing because I was afraid I'd start needing them again at
some point in the future.


OK, use your best judgment.  If they are small and they add to symmetry 
of services (like evalQuad) or they might be used later, then it isn't a 
big deal, but dead code in private APIs shouldn't be just left laying 
around if we can help it.


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-19 Thread Jim Graham

Hi Denis,

On 10/19/2010 10:40 AM, Denis Lila wrote:

ROCsq - I looked at the wikipedia article and it wasn't clear how it
directly related to your function since the article is dealing with
the curvature of a function graphed against its own t, but you are dealing
with 2 parametric equations combined and graphed against each other.
I think I'm going to have to just trust you on this one for now.  :-(


http://en.wikipedia.org/wiki/Radius_of_curvature_%28applications%29
Did you look at the above wikipedia article? When researching I came across
2 of them, and one of them only mentions natural parameterizations, but
the above has the general equation for a R-Rn function, then below that
they have the special case where n=2, x(t)=t, and y(t)=f(t). I used the
first equation on that page.
Actually, I wrote a simple program to make sure the radius of curvature
function was correct. I have attached it. It's not a proof, but I think
it is convincing. Just hold down the mouse button and move it horizontally.
This will change the t value on the curve and the circle drawn will have
radius equal to Math.sqrt(ROCsq). You can also change the control points of
the curve. There's a bug where when you run it the window is really tiny, so
you have to manually resize it and maximize it.


I actually did read that article, but I wasn't seeing the fact that it 
was a multiple parametric equation and that the || were distance 
calculations rather than simply absolute values.  Now I see it. 
Plugging those concepts in to the first equation the mapping is very direct.


One thing that confused me when I was proof-reading it was that the 
numerator seemed to be dx2dy2 squared when it should be cubed.  Then I 
spotted the final *dx2dy2 term at the end which makes it cubed.  I 
wasn't sure why you isolated that term out there instead of just 
grouping it with the rest of the numerator - is there a danger of 
overflow if you multiply it before you do the division?  If so, then 
that is fine since it doesn't actually affect the number of fp ops so it 
should be the same performance.
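
For reference, the quantity under discussion is the squared radius of
curvature of a parametric curve, R^2 = (x'^2 + y'^2)^3 / (x'y'' - y'x'')^2.
A sketch of the shape of the computation (my restatement of the standard
formula, not necessarily the exact expression in the webrev):

    // dx,dy are the first derivatives at t; ddx,ddy the second derivatives.
    static double rocSq(double dx, double dy, double ddx, double ddy) {
        double dx2dy2 = dx * dx + dy * dy;   // x'^2 + y'^2
        double num = dx * ddy - dy * ddx;    // x'y'' - y'x'' (0 means the
                                             // curve is locally straight and
                                             // the result is infinite)
        // Multiplying by the last dx2dy2 factor after the division keeps
        // the intermediate smaller; the fp-op count is the same either way.
        return (dx2dy2 * dx2dy2 / (num * num)) * dx2dy2;
    }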



lines 621-626 and 643-646 - I'm not sure what the derivation of these
lines are.  I tried to do my own equations, but I seem to be heading
in a different direction than you used and it didn't seem like I was
going to converge.  What equations did you set up to solve to get these
calculations?  From what I can see we have:
- new p1,p4 are calculated
- new p(0.5) is calculated
- the corresponding dx,dy at t=0,0.5,1 (gives slopes)
- slopes at t=0, 0.5 and 1 should be the same for all curves
and what you are trying to compute are two scaling constants c1 and c2
that represent how much to scale the dx1,dy1 and dx4,dy4 values to get
a curve that interpolates both position and slope at t=0.5.  A comment
might help here...  :-(


I see how (dxm,dym) was confusing. The only reason for computing the slope
at t=0.5 is to get the point of the offset curve at t=0.5. We don't make
the computed curve and the input curve have equal slopes at t=0.5 because
this would give us an overdetermined system. What we're trying to do in this
function is to approximate an ideal offset curve (call it I) of the input
curve B using a bezier curve Bp. The constraints I use to get the equations are:


It does help *a lot*, though, so thank you for writing it up.  I would 
move it closer to the code in question since the function has such a 
long preamble that separates the comment from the code that implements 
it (also method comments are usually reserved for API documentation 
purposes).


lines 544,559 - I'd remove the line numbers from the comment.  They are 
already wrong and they won't survive any more edits any better.  ;-)


In #2, you have a bunch of I'() || B'() which I read as the slope of 
the derivative (i.e. acceleration) is equal, don't you really mean I() 
|| B() which would mean the original curves should be parallel? 
Otherwise you could say I'() == B'(), but I think you want to show 
parallels because that shows how you can use the dxy1,dxy4 values as the 
parallel equivalents.


I would rename det43 to invdet43 to indicate that it is the inverse of 
the determinant.  I kept looking at it and thinking "he has the 
determinant on the wrong side" until I noticed that it was in the 
denominator of det43 (which is hard to read in parenthesized C-math).


One side note.  At first glance I would have thought that the final 
equations would have subtracted the c2*dxy4 terms rather than adding 
them (since dxy4 represent p4-p3, not p3-p4 and so the linear 
interpolation equation looks backwards), but that isn't true because 
you did all of your math looking to find the c2 that belongs in this 
equation (as backwards as it seems) and so you got that answer. 
Interestingly if you look at the effect on the results if you calculate 
the dxy4 terms the other way around, they are simply negated and the 
impact would be that c1 would be unaffected (both num and 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-18 Thread Denis Lila
 Are you happy with the current variable names?

Not really. The names you suggested are much better. I'm using them now.
As for making a vector class, I think we should push this and then decide.
It's absence has already done most of the damage it could possibly do, so
it's not an urgent matter. And besides, pushing a good version of this first
will make it easier to determine the performance impact of the vector class.

 line 208 - isn't this just side = false since side is either 0 or
 1?
 Also, side is only ever 1 for an end cap in which case we need exactly
 2 90 degree beziers which are very simple to compute and could be hard 
 coded.  Was there a reason not to just have a special roundCap 
 function which would be 2 hardcoded and fast emitCurveTo calls?  The 
 code would be something like:
  curveTo(/*x+mx,y+my,*/
  x+mx-C*my, y+my+C*mx,
  x-my+C*mx, y+mx+C*my,
  x-my,  y+mx);
  curveTo(/*x-my,y+mx,*/
  x-my-C*mx, y+mx-C*my,
  x-mx-C*my, y-my+C*mx,
  x-mx,  y-my);
 where C = 0.5522847498307933;
 // Computed btan constant for 90 degree arcs
 
 (rest of drawRoundJoin method - it may take some doing, but eventually
 this method should simplify based on: there will only ever be 1 or 2 
 curves, Math.sin(Math.atan2()) cancels out as does 
 Math.cos(Math.atan2()) though to do so requires Math.hypot() which is
 a simpler calculation than any of the transcendentals.  So, if there was
 an easy test for acute/obtuse angle and a way to find the center of
 an angle (both I'm sure we could find on the net), then we could
 eliminate the transcendentals from this method).

I introduced a drawRoundCap method. This eliminated the side argument from
the round join drawing, which made it easier to eliminate the trig function
calls. I did this by using dot products to compute cosines (which was possible
because now Stroker takes only untransformed paths, and all lineWidths are the
same), and I used the double angle identities to compute any sines.
I came up with my own ways of detecting acute/obtuse angles and finding
the centres of angles ("my own" meaning I didn't look at any websites),
and they consist of:
1. if (omx * mx + omy * my) >= 0 then the angle is acute (line 200; see the sketch below).
2. I explain this in a comment in the file (line 208).
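
A sketch of the trig-free computation being described (my reconstruction,
assuming both offset vectors have the same known length w):

    // Cosine of the angle between (omx,omy) and (mx,my): both vectors
    // have length w, so the dot product only needs a division by w*w;
    // no sqrt of the norms and no transcendental calls.
    static double cosAngle(double omx, double omy,
                           double mx, double my, double w) {
        return (omx * mx + omy * my) / (w * w);
    }

    // The double-angle identities, read in reverse, then give the sine
    // and cosine of half the angle (what a bisector construction needs):
    //     cos(a/2) = sqrt((1 + cos a) / 2)
    //     sin(a/2) = sqrt((1 - cos a) / 2)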

 I would combine the emit*To methods into just the one version that
 takes a boolean.  The number of times you call them without the boolean are
 few and far between and the versions that don't take the boolean are
 so few lines of code that inlining them into the boolean versions of the
 methods will still make for nice and tight code.

I was tempted to do this. I didn't because the boolean versions will need
absolute coordinates, while the non boolean ones require relative ones. So
if the non boolean versions need to be called and all we have are the boolean
ones, 2 dummy arguments need to be supplied. However, I've looked at the code
more closesly, and it turns out that we only use the non boolean versions
from inside the boolean versions, so I've followed your suggestion (except
on emitLineTo, since the non boolean version of that is used quite a bit).

 line 374 - why is this moveto here?  Doesn't this break the joined
 path into 2 separate paths?  Should this be a lineto?

It does break it into 2 separate paths, but that's the correct behaviour
in this case. Mathematically speaking, the 2 offset curves are separate
curves (despite any intersections). This changes when we use caps, but 
when using closePath(), caps aren't drawn so we *should* have 2 separate
paths. This is also the behaviour of oracle's closed source java (which
can be seen in one of the Java2Ddemo demos - the one that draws the offset
curves of a star with a vertical slider controlling the line width).

 (Also, sx0==x0 and sy0==y0 at this point).

I am now using s*0 instead of *0, since the expressions involve sdx and sdy,
so it's a bit clearer.

 line 389 - The test here is different from closePath.  What if they
 were both prev == DRAWING_OP_TO?

I am now using prev!=DRAWING_OP_TO (not ==, since it is supposed to execute
finish() if no nonzero length lines have been fed to Stroker yet). In fact
I have removed the started variable since it's equivalent to 
prev==DRAWING_OP_TO.

 line 337 - shouldn't this just return?  I don't think that empty lines
 should modify the path at all.  If this is a case of moveto(x,y); 
 lineto(x,y) then the finish() code should deal with the path that 
 never went anywhere - i.e. drawing a dot, shouldn't it?  The only 
 problem is that moveTo never set up omx,omy so finish will likely draw
 something random.  Perhaps if moveto (and closepath) simply set up 
 omx,omy to w,0 (or 0,-w if you prefer) then finish would do a
 reasonable thing for empty paths?

The reason I made it the way it is is to match what proprietary java
does. If one tries to draw a path like moveTo(0,0);lineTo(100,-100);

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-18 Thread Jim Graham

Hi Denis,

Looks like some great new work here!  I'll try to keep the pie in the 
sky suggestions down now so we can get this in soon...


On 10/18/2010 2:19 PM, Denis Lila wrote:

Hi Jim.


I'm just now getting down to the nitty gritty of your webrevs (sigh).


Thanks. I hope it's not too bad.


The code was great - what sucked was all of the cobwebs on my trig and 
curve math neurons.



PiscesRenderingEngine.java:
line 296 - is there an epsilon that we can use?  == with floating
point often fails with calculations.


I was thinking maybe something more like the ULP stuff you did in one of 
the other files.  I don't think 2 non-equal fp values can be subtracted 
and produce a value that is as small as MIN_VALUE unless you are 
subtracting 2 extremely tiny numbers.
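
One way such a ulp-scaled comparison can look (my own sketch, not
necessarily what the webrev ended up doing):

    static boolean within(float a, float b, int ulps) {
        // Math.ulp(x) is the gap to the next representable float, so the
        // tolerance tracks the magnitude of the operands instead of being
        // a fixed epsilon (or an exact == test).
        float tol = ulps * Math.ulp(Math.max(Math.abs(a), Math.abs(b)));
        return Math.abs(a - b) <= tol;
    }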



line 338 - null here too


If this is now line 341 you still use "at" which might be a non-null 
identity transform.  I'd just use null, as some shapes might try to do 
some work if they get a non-null identity transform, but null pretty 
much tells them it's identity.



I turned LengthComputer into an iterator. I think it's much cleaner now.
There's no longer any of that "scale every t in the array so that they
become valid parameters of the right subdivided curve".
It also uses less memory - just limit+1 curves. I guess I am clever enough ;)
(though unfortunately not clever enough to have thought of the idea myself).


Interesting solution.  I like it.

line 248,251 - I thought it was a bug that you used 2 when I thought you 
should use 0, but it turns out that it doesn't matter because the last 
point of left is the first point of right.  So, I'm not sure why you use 
2, but it isn't a bug.  However...


You only need the array to be 8+6 if you take advantage of that shared 
point and store the 2 halves at 0..type and type-2 .. 2*type-2. 
Just a thought.  No real bug here.



I found a problem with Dashing though. Curves like moveTo(0,0);
curveTo(498,498,499,499,500,500); are not handled well at all.
http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/ is the link
with the new webrev. I have fixed this problem by doing binary search on
the results of the flattening. I really don't like this solution because
it does *a lot* more subdivisions than just flattening.


Ah, I get it now.  Hmmm.  We can leave it for now, but I'm pretty sure 
we can detect cases like this because the sides of the control polygon 
are not relatively equal and only do the recursion if the control 
polygon indicates some amount of acceleration is happening.  Leave it 
for now and make a mental note of this for later.  Also, if there is 
acceleration then I think you could just solve either the X or the Y 
cubic for the necessary point (xs = interp(x0,x1,len/leafLen), solve for 
xs).


One simplification to your binary search - since we know the length is 
relatively close to chord length, just compute the point on the curve at 
t and then use the distance formula to the start point to compute the 
arc length - no subdividing needed, just an eval and a linelen, and 
bsBuf goes away...
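
i.e. something like this sketch (illustrative only - I'm assuming the 
curve = {x0,y0, cx1,cy1, cx2,cy2, x1,y1} layout and a leaf that is 
already nearly flat):

    // Bezier point at t via the Bernstein form; the distance from the
    // start point then approximates the arc length at t, since the leaf
    // is nearly a straight line - no subdivision buffer required.
    static double chordLenAt(double[] c, double t) {
        double u = 1 - t;
        double x = u*u*u*c[0] + 3*t*u*u*c[2] + 3*t*t*u*c[4] + t*t*t*c[6];
        double y = u*u*u*c[1] + 3*t*u*u*c[3] + 3*t*t*u*c[5] + t*t*t*c[7];
        return Math.hypot(x - c[0], y - c[1]);
    }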


...jim


Regards,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


HI Denis,

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


TransformingPolyIOHelper should be in its own file - we consider more
than one class per file to be bad form these days, especially if the
class is used outside of that file.

I'm a little troubled by how the PolyIOHelper fits into the design.
It's odd to talk to the same object for both input and output.  I have
some ideas there, but I think I'll leave it for a followon email and
effort.

Dasher.java:





 boolean needsMoveto;

in moveTo and pathDone:
  if (firstSegBuf is not empty) {
  output moveto(sx,sy)
  output firstSegs
  }
  needsMoveto = true;  // not needed in pathDone

in goTo() {
  if (ON) {
  if (starting) {
  store it in firstSegBuf
  } else {
  if (needsMoveto) {
  output moveto(x0,y0);
  needsMoveto = false;
  }
  send it to output
  }
  } else {
  starting = false;
  needsMoveto = true;
  // nothing goes to output
  }
}

and in ClosePath:
  lineToImpl(sx, sy, LINE);
  if (firstSegBuf is not empty) {
  if (!ON) {  // Or if (needsMoveto)
  output moveTo(sx, sy)
  }
  output firstSegs
  }

I don't see a need for firstDashOn or fullCurve


Stroker.java:

line 129 - proof that miterLimit does not need to be scaled... ;-)

I'm going to send this buffer of comments off now and continue on with

Stroker.java in a future email...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-18 Thread Jim Graham

Hi Denis,

On 10/18/2010 2:21 PM, Denis Lila wrote:

Are you happy with the current variable names?


Not really. The names you suggested are much better. I'm using them now.
As for making a vector class, I think we should push this and then decide.
Its absence has already done most of the damage it could possibly do, so
it's not an urgent matter. And besides, pushing a good version of this first
will make it easier to determine the performance impact of the vector class.


Woohoo!  Note comment needs updating at line 90.


I introduced a drawRoundCap method. This eliminated the side argument from
the round join drawing, which made it easier to eliminate the trig function
calls. I did this by using dot products to compute cosines (which was possible
because now Stroker takes only untransformed paths, and all lineWidths are the
same), and I used the double angle identities to compute any sines.
I came up with my own ways of detecting acute/obtuse angles and finding the
centres of angles ("my own" meaning I didn't look at any websites), and they
consist of:
1. if (omx * mx + omy * my) >= 0 then the angle is acute (line 200).
2. I explain this in a comment in the file (line 208).
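
To illustrate the gist of it (this is only a sketch with illustrative
names, not the exact code in the file): both (omx,omy) and (mx,my) have
length w, so their dot product gives the cosine of the join angle
directly, and the half-angle values fall out of the double angle
identities:

    static double[] joinHalfAngle(double omx, double omy,
                                  double mx, double my, double w) {
        // dot product over w*w is the cosine of the angle between the
        // two offset vectors; (omx*mx + omy*my) >= 0 is the acuteness test
        double cosT = (omx * mx + omy * my) / (w * w);
        // cos T = 2*cos^2(T/2) - 1 = 1 - 2*sin^2(T/2)
        double cosHalf = Math.sqrt((1 + cosT) / 2);
        double sinHalf = Math.sqrt((1 - cosT) / 2);
        return new double[] { cosHalf, sinHalf };
    }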


Yay.  And I can't believe you got that much mileage out of that one 
change.  Cool!  I'll verify the math tomorrow (it doesn't look hard, but 
I'm almost out of here).



I was tempted to do this. I didn't because the boolean versions will need
absolute coordinates, while the non boolean ones require relative ones. So
if the non boolean versions need to be called and all we have are the boolean
ones, 2 dummy arguments need to be supplied. However, I've looked at the code
more closely, and it turns out that we only use the non boolean versions
from inside the boolean versions, so I've followed your suggestion (except
on emitLineTo, since the non boolean version of that is used quite a bit).


OK, no problem.


line 374 - why is this moveto here?  Doesn't this break the joined
path into 2 separate paths?  Should this be a lineto?


It does break it into 2 separate paths, but that's the correct behaviour
in this case. Mathematically speaking, the 2 offset curves are spearate
curves (despite any intersections). This changes when we use caps, but
when using closePath(), caps aren't drawn so we *should* have 2 separate
paths. This is also the behaviour of oracle's closed source java (which
can be seen in one of the Java2Ddemo demos - the one that draws the offset
curves of a star with a vertical slider controlling the line width).


Oh, duh!  I get it.  I had been looking at Dasher all day before that 
and so I was thinking of this in terms of connecting the last dash to 
the first which would, of course, be one continuous path, but this is 
Stroker so if you get a close then it has an inner and outer path. 
Sorry for the distraction.



line 389 - The test here is different from closePath.  What if they
were both prev == DRAWING_OP_TO?


I am now using prev!=DRAWING_OP_TO (not ==, since it is supposed to execute
finish() if no nonzero length lines have been fed to Stroker yet). In fact
I have removed the started variable since it's equivalent to 
prev==DRAWING_OP_TO.


Interesting.  I'll have to trace this later, but it sounds good.


line 337 - shouldn't this just return?  I don't think that empty lines
should modify the path at all.  If this is a case of moveto(x,y);
lineto(x,y) then the finish() code should deal with the path that
never went anywhere - i.e. drawing a dot, shouldn't it?  The only
problem is that moveTo never set up omx,omy so finish will likely draw
something random.  Perhaps if moveto (and closepath) simply set up
omx,omy to w,0 (or 0,-w if you prefer) then finish would do a
reasonable thing for empty paths?


The reason I made it the way it is is to match what proprietary java
does. If one tries to draw a path like moveTo(0,0);lineTo(100,-100);
lineTo(100,-100);lineTo(0,-200); then instead of ignoring the second
lineTo(100,-100) it behaves as if it were something like lineTo(100.1,-100.1),
and it draws the join. Of course, just because proprietary java does it, that
doesn't mean it's the right thing to do. So, should I still make it ignore
segments of 0 length?


No, let me think about this some more.  Compatible is a good default for 
now until we understand it fully so let's not derail for that.



line 283 - doesn't this simplify to?:
 t = x10p*(y0-y0p) - y10p*(x0-x0p)
(source: http://local.wasp.uwa.edu.au/~pbourke/geometry/lineline2d/)
then calculating:
 t = (...)/den;
can amortize the dividend from the following 2 calculations.


I am using this t calculation now. I don't see how what I had simplifies
into this, though. This makes me think we were using a wrong equation, which
is puzzling, since I didn't notice any problems with drawing miters or
quadratic beziers. Well, maybe I just made some mistake in trying to show
they're equivalent. It doesn't matter.


No, actually they 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-15 Thread Jim Graham

Round 4...

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


BezCurve.java:

I'd add some set() methods to BezCurve/Curve and then use a scratch 
instance in Renderer (and other places?) to reuse those calculations, 
such as:


Curve constructors (obviously)
Renderer.curveOrQuadTo()
Renderer.initQuad()
Renderer.initCurve()
Stroker.findSubdivPoints()

lines 179,182 - using your d* variables, wouldn't these be:
   a = 2*(dax*dax + day*day)
   b = 3*(dax*dbx + day*dby)
   c = 2*(dax*cx + day*cy) + dbx*dbx + dby*dby
   d = dbx*cx + dby*cy
It has fewer multiplies and more of the multipliers are powers of 2 
which are faster than non-power-of-2 multiplies.


line 206,210 - a nit - it didn't really confuse me, but halfway through 
reading this it occurs to me that these are really t0 and t1, right?


line 212 - if x0 (t0?) is 0 then you don't need to include it in the 
roots, no?


line 249,257 - these corrections are exponential compared to the sample 
code on the wikipedia page - was that the slight modification that you 
mentioned in the comments?


line 303 - isn't it enough to just look at the previous T value (or keep 
a running prevt variable) rather than update every T value in the 
array?  Isn't this enough?

float prevt = 0; /* field in Iterator; the t values are fractional */
next() {
curt = Ts[next];
split = (curt - prevt) / (1 - prevt);
prevt = curt;
}

ROCsq - I looked at the wikipedia article and it wasn't clear how it 
directly related to your function since the article is dealing with the 
curvature of a function graphed against its own t, but you are dealing 
with 2 parametric equations combined and graphed against each other.  I 
think I'm going to have to just trust you on this one for now.  :-(
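
(For reference, the parametric formula would be
ROC(t) = (x'^2 + y'^2)^(3/2) / |x'*y'' - y'*x''|, so presumably ROCsq
computes (x'^2 + y'^2)^3 / (x'*y'' - y'*x'')^2 and avoids the square
root.)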


Done with BezCurve.java...

Stroker.java:

lines 558 (et al) - create a helper function for all of these 
(degenerates to a line) cases?


lines 621-626 and 643-646 - I'm not sure what the derivation of these 
lines are.  I tried to do my own equations, but I seem to be heading in 
a different direction than you used and it didn't seem like I was going 
to converge.  What equations did you set up to solve to get these 
calculations?  From what I can see we have:

  - new p1,p4 are calculated
  - new p(0.5) is calculated
  - the corresponding dx,dy at t=0,0.5,1 (gives slopes)
  - slopes at t=0, 0.5 and 1 should be the same for all curves
and what you are trying to compute are two scaling constants c1 and c2 
that represent how much to scale the dx1,dy1 and dx4,dy4 values to get a 
curve that interpolates both position and slope at t=0.5.  A comment 
might help here...  :-(


line 687 - dup?

line 856 - use a scratch Curve instance and set methods to reduce GC?

line 857 - this is true if the first vector is parallel to either axis, 
but the comment after it says only "parallel to the x axis" - is that a
problem?  Also, if both are 0 then no parallel constraint is applied
even if it does start out parallel.  I imagine that this is all OK
because the "both 0" case should be rare/non-existent and the "y-axis"
case will also benefit from the same optimization...?


lines 861-863:  sin/cos and atan2 cancel each other out.  It is faster 
to compute the hypotenuse and then divide the x and y by the hypotenuse 
to get the cos and sin.  (cos = x/hypot; sin = y/hypot;)


Cache and TileGenerator look ok...

I think I'm done at this point except for not understanding the 
parallel cubic equations I mentioned above and the ROCsq method...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-14 Thread Jim Graham

Round 3...

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


I'm going to set the rest of Stroker.java aside for a moment and focus 
on other areas where I have some better knowledge...


Renderer.java:

lines 83, 91, 99: can't these be folded into the prior loops?  You can 
update their Y while searching for the [eqc]hi value.


lines 178,192: indent to the preceding "("?  (Also, I'm a big fan of
moving the "{" to a line by itself if a conditional or argument list
was wrapped to more than 1 line - the 2D team tends to use that style
everywhere in the 2D code...)


line 193: add fieldForCmp here instead of every time in the loop?  (The 
compiler will probably/hopefully do that anyway)


line 238: If X0,Y0,XL,COUNT were bumped up by 1 then you could just 
reuse SLOPE from the linear indices - just a thought.


lines 521,527,533: Why are these executed twice?  You call these methods 
again after the initialize common fields code.  That seems like double 
the work just to maybe save 4 lines of shared code?  Maybe put the 4 
lines of shared code in a helper function that all of the init() methods 
call?


line 574: indentation?

line 566: shouldn't horizontal lines be ignored?  they don't affect 
rasterization.


line 612: delete?  Or will this be making a comeback sometime?

lines 624,626: indentation?

lines 724,725: doesn't the "assert false" omit these?  I usually throw an
InternalError in cases like this with a description of what went wrong.


I've read up through the use of the cache in emitRow().  I'll continue 
with reviewing the cache in the next set, meanwhile I also took a look 
at the helper classes...


Helpers.java:

line 37: If it can't be instantiated, why does it take arguments?

getTransformedPoints isn't used?
getUntransformedPoints isn't used?
fillWithIndxes(float) isn't used?
evalQuad isn't used?  (Though it does provide symmetry with evalCubic 
which is used)

getFlatness* aren't used?
ptSegDistSq isn't used?

line 105: There is a closed form solution to cubic roots.  I 
unfortunately used a bad version in CubicCurve2D.solveCubic and I don't 
remember if I ever went back and fixed it (it may even have been 
Cardano's method for all I know).  There are versions out there which 
do work better.  The problem with the one in CC2D was that I copied it 
out of Numerical Recipes in C and apparently the author somehow assumed 
that all cubics would have 1 or 3 roots, but a cubic of the form 
(x-a)(x-a)(x-b) has 2 roots.  D'oh!  While I did find other 
implementations out there on the net, in the end fixing the closed form 
solution seemed fraught with issues since many of the tests would use 
radically different approaches depending on tiny changes in one of the 
intermediate results and so I worried about FP error even in doubles 
possibly skewing the results.  I think you should leave your code in 
there, but I wanted to fill you in on other possibilities.
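
For reference, a minimal sketch of the closed form for a depressed cubic
t^3 + p*t + q = 0 that handles the 2-root case explicitly (illustrative
only - not the CC2D code, and the exact D == 0 test is precisely where
FP error bites):

    static int solveDepressedCubic(double p, double q, double[] res) {
        double D = (q / 2) * (q / 2) + (p / 3) * (p / 3) * (p / 3);
        if (D > 0) {                      // one real root
            double sq = Math.sqrt(D);
            res[0] = Math.cbrt(-q / 2 + sq) + Math.cbrt(-q / 2 - sq);
            return 1;
        } else if (D == 0) {              // the (x-a)(x-a)(x-b) case
            double u = Math.cbrt(-q / 2);
            res[0] = 2 * u;               // the simple root
            res[1] = -u;                  // the double root
            return (q == 0) ? 1 : 2;      // q == 0 here means a triple root at 0
        } else {                          // three distinct real roots
            double r = 2 * Math.sqrt(-p / 3);
            double phi = Math.acos(3 * q / (p * r)) / 3;
            for (int k = 0; k < 3; k++) {
                res[k] = r * Math.cos(phi - 2 * Math.PI * k / 3);
            }
            return 3;
        }
    }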


BezCurve.java:

Didn't you get a complaint that this class is defined in a file of the 
wrong name?  Maybe the compiler doesn't complain because the class isn't 
public, but one of the names should change to match.


line 59: I'd throw an internal error and the compiler would be appeased.

line 35: If you make this a create factory then it can leverage the 
code in the existing constructors - just a thought.


I'll stop here and hit send - not much left after this round...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-13 Thread Jim Graham

HI Denis,

I'm just now getting down to the nitty gritty of your webrevs (sigh).

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


PiscesRenderingEngine.java:
line 278 - the det calculation is missing b.

line 296 - is there an epsilon that we can use?  == with floating 
point often fails with calculations.


line 308 - miterlimit is a ratio of lengths and should not need to be 
scaled.


line 332 - I think you can use a null transform for the same result.

line 338 - null here too

TransformingPolyIOHelper should be in its own file - we consider more 
than one class per file to be bad form these days, especially if the 
class is used outside of that file.


I'm a little troubled by how the PolyIOHelper fits into the design. 
It's odd to talk to the same object for both input and output.  I have 
some ideas there, but I think I'll leave it for a followon email and effort.


Dasher.java:

lines 110,111 - shouldn't you check if there are any first segments 
before writing the extra move?


lines 150-152 - starting should be left true until you reach the end of 
the dash, no?  Otherwise you only hold back the starting segments up 
until the first piece of a curve.  Everything should be held back 
until you reach an off piece.  I don't think the logic for these 
variables is right yet.  Here is what I see:


   boolean needsMoveto;

in moveTo and pathDone:
    if (firstSegBuf is not empty) {
        output moveto(sx,sy)
        output firstSegs
    }
    needsMoveto = true;  // not needed in pathDone

in goTo() {
    if (ON) {
        if (starting) {
            store it in firstSegBuf
        } else {
            if (needsMoveto) {
                output moveto(x0,y0);
                needsMoveto = false;
            }
            send it to output
        }
    } else {
        starting = false;
        needsMoveto = true;
        // nothing goes to output
    }
}

and in ClosePath:
    lineToImpl(sx, sy, LINE);
    if (firstSegBuf is not empty) {
        if (!ON) {  // Or if (needsMoveto)
            output moveTo(sx, sy)
        }
        output firstSegs
    }

I don't see a need for firstDashOn or fullCurve

line 228 - Lazy allocate lc?  Polygons, rectangles, and lines won't need 
it to be dashed (though dashing is already expensive enough it might not 
be noticeable, still waste is waste and there is only one piece of code 
that uses lc so it is easy to protect with a lazy allocation)


line 235 - is there a reason to output a curve of 0 length?  lines of 0 
length are omitted...


line 257 - shouldn't the left and right indices always be at 0 and 
type-curCurveoff?  It looks like after the first time through this loop 
you are storing the right half on top of the left half (see line 262)? 
That would output odd values if any curve gets subdivided more than 
once, though, right?


line 289 - LenComputer always allocates maxcurves segments, which is 
8*1024 words (32K) + object overhead * 1024 + 2 more arrays of 1025 
words.  That's a lot of memory for the worst case scenario.  It might be 
nice to come back to this and have it be more dynamic (and it is more of 
a push to have the lc variable be lazily allocated above).  Also, if 
you are clever enough, you never need storage for more than about 10 
(maybe 11) curves even if you produce 1024 t's and len's - due to the 
recursive nature of the subdivision that leaves one half dormant while 
the other half is explored.
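
Roughly like this sketch (the flatness test, length estimate and buffer
layout here are all illustrative, not the real LenComputer):

    // Depth-first length estimate of a cubic c = {x0,y0,cx1,cy1,cx2,cy2,x1,y1}.
    // Each level parks exactly one dormant right half on the stack, so
    // MAX_DEPTH+1 curve buffers are all the storage ever needed.
    static double curveLength(double[] c0) {
        final int MAX_DEPTH = 10;
        double[][] stack = new double[MAX_DEPTH + 1][];
        int[] depth = new int[MAX_DEPTH + 1];
        int top = 0;
        stack[0] = c0;
        double len = 0;
        while (top >= 0) {
            double[] c = stack[top];
            int d = depth[top--];
            double poly = Math.hypot(c[2]-c[0], c[3]-c[1])
                        + Math.hypot(c[4]-c[2], c[5]-c[3])
                        + Math.hypot(c[6]-c[4], c[7]-c[5]);
            double chord = Math.hypot(c[6]-c[0], c[7]-c[1]);
            if (d == MAX_DEPTH || poly - chord < 1e-4) {
                len += (poly + chord) / 2;  // arc length is bracketed by these
            } else {
                double[] l = new double[8], r = new double[8];
                subdivideAtHalf(c, l, r);
                stack[++top] = r; depth[top] = d + 1;  // dormant half
                stack[++top] = l; depth[top] = d + 1;  // explored first
            }
        }
        return len;
    }

    // plain de Casteljau split at t = 0.5
    static void subdivideAtHalf(double[] c, double[] l, double[] r) {
        double x01 = (c[0]+c[2])/2, y01 = (c[1]+c[3])/2;
        double x12 = (c[2]+c[4])/2, y12 = (c[3]+c[5])/2;
        double x23 = (c[4]+c[6])/2, y23 = (c[5]+c[7])/2;
        double xa = (x01+x12)/2, ya = (y01+y12)/2;
        double xb = (x12+x23)/2, yb = (y12+y23)/2;
        double xm = (xa+xb)/2,   ym = (ya+yb)/2;
        l[0]=c[0]; l[1]=c[1]; l[2]=x01; l[3]=y01;
        l[4]=xa;   l[5]=ya;   l[6]=xm;  l[7]=ym;
        r[0]=xm;   r[1]=ym;   r[2]=xb;  r[3]=yb;
        r[4]=x23;  r[5]=y23;  r[6]=c[6]; r[7]=c[7];
    }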


line 347,352 - it might be cleaner to have the calling function keep 
track of how far into the curve you are and communicate this to the 
method so it doesn't have to clobber its data based on an assumption of 
how the calling function is structured.  But, this arrangement works 
fine for the current purposes and you do have a TODO comment in there 
about this.


Stroker.java:

line 129 - proof that miterLimit does not need to be scaled... ;-)

I'm going to send this buffer of comments off now and continue on with 
Stroker.java in a future email...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-13 Thread Jim Graham

Round 2

On 10/13/2010 3:40 PM, Jim Graham wrote:

HI Denis,

I'm just now getting down to the nitty gritty of your webrevs (sigh).

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


Stroker.java:

Are you happy with the current variable names?  You're doing the bulk of 
the work now so if they make sense to you now then it might be best to 
leave them alone, but they give me headaches trying to figure them out. 
 I think you are right that it might help to create some vector 
helper classes.  I eventually got used to the naming by the time I was 
done with the file, but yikes - this will hurt the next guy that comes 
along to maintain the code.

The sx0,sy0,sdx,sdy variables are (reasonably) well named.
The x0,y0,pdx,pdy variables aren't consistent.  Perhaps cx0,cy0,cdx,cdy 
for current would make more sense?
The mx0,my0,omx,omy variables are even further from the prior naming 
conventions, what about smx,smy,cmx,cmy?


I would combine the emit*To methods into just the one version that takes 
a boolean.  The number of times you call them without the boolean are 
few and far between and the versions that don't take the boolean are so 
few lines of code that inlining them into the boolean versions of the 
methods will still make for nice and tight code.


line 208 - isn't this just side = false since side is either 0 or 1?
Also, side is only ever 1 for an end cap in which case we need exactly 2 
90 degree beziers which are very simple to compute and could be hard 
coded.  Was there a reason not to just have a special roundCap 
function which would be 2 hardcoded and fast emitCurveTo calls?  The 
code would be something like:

curveTo(/*x+mx,y+my,*/
x+mx-C*my, y+my+C*mx,
x-my+C*mx, y+mx+C*my,
x-my,  y+mx);
curveTo(/*x-my,y+mx,*/
x-my-C*mx, y+mx-C*my,
x-mx-C*my, y-my+C*mx,
x-mx,  y-my);
where C = 0.5522847498307933;
// Computed btan constant for 90 degree arcs

(rest of drawRoundJoin method - it may take some doing, but eventually 
this method should simplify based on: there will only ever be 1 or 2 
curves, Math.sin(Math.atan2()) cancels out as does 
Math.cos(Math.atan2()) though to do so requires Math.hypot() which is a 
simpler calculation than any of the transcendentals.  So, if there was 
an easy test for acute/obtuse angle and a way to find the center of an 
angle (both I'm sure we could find on the net), then we could eliminate 
the transcendentals from this method).
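
For example, each sin/cos pair collapses to a division (illustrative
fragment, where (mx,my) is the offset vector whose angle we used to take):

    double hypot = Math.hypot(mx, my);
    double cos = mx / hypot;   // == Math.cos(Math.atan2(my, mx))
    double sin = my / hypot;   // == Math.sin(Math.atan2(my, mx))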


line 283 - doesn't this simplify to?:
   t = x10p*(y0-y0p) - y10p*(x0-x0p)
(source: http://local.wasp.uwa.edu.au/~pbourke/geometry/lineline2d/)
then calculating:
   t = (...)/den;
can amortize the dividend from the following 2 calculations.
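
i.e. something like this sketch, assuming (x0,y0)/(x10,y10) and
(x0p,y0p)/(x10p,y10p) are a point and a direction vector on each of the
two offset lines:

    static double[] intersect(double x0, double y0, double x10, double y10,
                              double x0p, double y0p, double x10p, double y10p) {
        double den = x10 * y10p - y10 * x10p;  // cross product of directions;
                                               // 0 means the lines are parallel
        double t = (x10p * (y0 - y0p) - y10p * (x0 - x0p)) / den;
        return new double[] { x0 + t * x10, y0 + t * y10 };
    }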

line 337 - shouldn't this just return?  I don't think that empty lines 
should modify the path at all.  If this is a case of moveto(x,y); 
lineto(x,y) then the finish() code should deal with the path that 
never went anywhere - i.e. drawing a dot, shouldn't it?  The only 
problem is that moveTo never set up omx,omy so finish will likely draw 
something random.  Perhaps if moveto (and closepath) simply set up 
omx,omy to w,0 (or 0,-w if you prefer) then finish would do a reasonable 
thing for empty paths?


line 374 - why is this moveto here?  Doesn't this break the joined path 
into 2 separate paths?  Should this be a lineto?  (Also, sx0==x0 and 
sy0==y0 at this point).


line 389 - The test here is different from closePath.  What if they were 
both prev == DRAWING_OP_TO?


line 394 - or prev = CLOSE to match the initial state?  (I guess it 
shouldn't really matter unless an upstream feeder has a bug.)


line 486 - this leaves the current point in a different place than line 
510, does that matter?


My head started spinning when evaluating the parallel curve methods so 
I'll stop here for now...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-12 Thread Denis Lila
Hi Jim.

  2. I changed how the alpha map is managed in PiscesTileGenerator to
  something that's a bit clearer and uses less memory (the latter comes
  from changing the +300 in the alpha tile allocation to +1. If there was
  a good reason for using 300, please tell me).
 
 Did I do that?  Wow.  I wish I knew.  There were probably some bugs in
 the alpha accumulation at some point.  Since it was indexed by a byte, I 
 find it hard to believe that it would need 300 entries of padding.

I don't know who did it. I didn't mean to imply I thought it was you.
In hindsight, the wording of "If there was a good reason for using 300,
please tell me" was pretty terrible. I only asked because 300 seemed like a
very out-of-place number and I thought it was a bugfix, but I couldn't see
for what bug, so I thought you might know, since you've helped me out in
this sort of situation before (i.e. sx0, sy0 in Dasher).

 One thing - will we ever need more than one alpha map in practice?  I 
 don't believe we will since it depends on the maxalpha from the Renderer 
 which is a fixed value.  So, the hashmap cache is probably overkill 
 compared to just seeing if the existing one is the right size, no?

Right now, it's true that there will never be more than one alpha map, so
you might say the HashMap is overkill, but I don't think this is a problem
because performance wise it costs nearly nothing and I think the code is easier
to read now.
But it's not a big deal, I can change it back if you want.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 On 10/8/2010 8:53 AM, Denis Lila wrote:
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-12 Thread Jim Graham

Hi Denis,

On 10/12/2010 6:01 AM, Denis Lila wrote:

Hi Jim.


2. I changed how the alpha map is managed in PiscesTileGenerator to
something that's a bit clearer and uses less memory (the latter comes
from changing the +300 in the alpha tile allocation to +1. If there was
a good reason for using 300, please tell me).


Did I do that?  Wow.  I wish I knew.  There were probably some bugs in
the alpha accumulation at some point.  Since it was indexed by a byte, I
find it hard to believe that it would need 300 entries of padding.


 I don't know who did it. I didn't mean to imply I thought it was you.
In hindsight, the wording of "If there was a good reason for using 300,
please tell me" was pretty terrible. I only asked because 300 seemed like a
very out-of-place number and I thought it was a bugfix, but I couldn't see
for what bug, so I thought you might know, since you've helped me out in
this sort of situation before (i.e. sx0, sy0 in Dasher).


It was most likely me since this code hasn't been touched much since I 
hacked it together.  I wasn't put out by your comment, I was simply 
making a public showing of confusion to cover my embarrassment.  ;-)



One thing - will we ever need more than one alpha map in practice?  I
don't believe we will since it depends on the maxalpha from the Renderer
which is a fixed value.  So, the hashmap cache is probably overkill
compared to just seeing if the existing one is the right size, no?


Right now, it's true that there will never be more than one alpha map, so
you might say the HashMap is overkill, but I don't think this is a problem
because performance wise it costs nearly nothing and I think the code is easier
to read now.
But it's not a big deal, I can change it back if you want.


No, I think it's OK if it doesn't show up on any benchmarks...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-08 Thread Denis Lila
Hello Jim.

Sorry for all the e-mails, but I made a couple of other
notable changes I should mention.
1. The optimization I described in one of my other e-mails that
allowed Renderer to skip subdivision of curves into monotonic
pieces if the curves came from Stroker is gone. There was a bug
with round joins and caps (the curves composing them would not be
monotonic). I could have fixed this, but there's also the problem
that the rest of Stroker's output curves can only be guaranteed to
be monotonic in x and y if we can find all points in the non-offset
curve where the radius of curvature == linewidth. My algorithm that
tries to find these points does so pretty well, but it doesn't always
find them all.

2. I changed how the alpha map is managed in PiscesTileGenerator to
something that's a bit clearer and uses less memory (the latter comes
from changing the +300 in the alpha tile allocation to +1. If there was
a good reason for using 300, please tell me).

That's all.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 On 9/27/2010 7:43 AM, Denis Lila wrote:
  Hi Jim.
  How much faster?  I'm worried about this, especially given our
 tiled
  approach to requesting the data.  What was the bottleneck before?
  (It's been a while since I visited the code - we weren't computing
 the
  crossings for every curve in the path for every tile being
 generated
  were we?)
 
   Not much faster. I'm working on something better.
 
 Then hopefully we can get to something with better memory and CPU
 costs.
 
   I'm not sure about the bottleneck, but what we were doing
 before is:
  1. Flatten (by subdividing) every curve so that we deal only with
 lines.
  2. Add each line to a list sorted by y0. When end_rendering was
 called
  for each scanline we found the crossings of the scanline and every
 line
  in our line list, which we used to compute the alpha for that
 scanline's
  pixel row. All this would be put into RLE encoded temporary storage
 and it
  would be read back and converted into tile form by
 PiscesTileGenerator.
 
   Speaking of which, would it not be better to get rid of
 PiscesCache and
  just keep a buffer with the current tile row in Renderer.java. This
 would
  be possible because the specification for AATileGenerator says the
 iteration
  is like: for (y...) for (x...);.
  Why is PiscesCache there? It isn't being used as a cache at all.
 Could it be?
  Also, why do we output tiles, instead of just pixel rows (which I
 guess would
  just be nx1 tiles). Is it because we would like to use
 getTypicalAlpha to eliminate
  completely transparent or completely opaque regions as soon as
 possible (and the
  longer a tile is the less of a chance it has at being either of
 those two)?
 
 That was basically cramming what we had into the interface's box. 
 The 
 cache existed for something that was being done on mobile, but it 
 doesn't have much of a place in our APIs so it was just reused for
 tile 
 generation.  If we have a much more direct way of doing it then it
 would 
 be great to get rid of it.
 
 I think we can support ALL1s and ALL0s reasonably without the
 cache.
 
  I can see your points here.  I think there are solutions to avoid
 much
  of the untransforming we can consider, but your solution works well
 so
  let's get it in and then we can look at optimizations if we feel
 they
  are causing a measurable problem later.
 
   I should say this isn't quite as bad as I might have made it
 seem. Firstly,
  this IO handler class I made eliminates transformations when
 Dasher
  communicates with Stroker. More importantly, no untransforming is
 done
  when the transformation is just a translation or is the identity or
 is singular
  and when STROKE_CONTROL is off, we only transform the output path.
 That's
  because the most important reason for handling transforms the way I
 do now
  is because we can't normalize untransformed paths, otherwise
 coordinate
  adjustments might be magnified too much. So, we need to transform
 paths
  before normalization. But we also can't do the stroking and
 widening
  before the normalization. But if normalization is removed we can
 just pass
  untransformed paths into Stroker, and transform its output (which is
 still
  somewhat more expensive than only transforming the input path,
 since
  Stroker produces many (3-7) curves for each input curve).
 
 Can the untransform be eliminated in the case of scaling?  (Whether
 just 
 for uniform scaling, or maybe even for non-uniform scaling with no 
 rotation or shearing?)
 
  I'm not sure I understand the reasoning of the control point
  calculation.  I'll have to look at the code to register an
 opinion.
 
   I'm sorry, my explanation wasn't very clear. I attached a
 picture that
  will hopefully clarify things.
  But, in a way, the computation I use is forced on us. Suppose we
 have a
  quadratic curve B and we need to compute one of its offsets C.
 C'(0)
  and C'(1) will be 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-29 Thread Denis Lila
Hi Jim.

 Then hopefully we can get to something with better memory and CPU costs.

I re-implemented the AA rasterizer twice more, with saving memory in
mind. The first version used a completely different algorithm for computing
crossings: it didn't do it incrementally. It just computed the t value
where B(t) - CurrentScanline = 0 using Newton's method, then it computed
the x value at this t, and that's the crossing. Because all our curves
were monotonic in x and y, we could guarantee that there was exactly one
such t per scanline per curve. The initial guess in Newton's method was
(scanline - y0)/(y1-y0) (where y1 and y0 are the y extrema of the curve).
This could be computed incrementally using just one addition per scanline,
and it worked very nicely in minimizing the iterations in Newton's method.
However, there still had to be at least one iteration, which came with at
least one multiplication and one division per scanline, which is much more
expensive than what adaptive forward differencing was doing. However, it
was still a bit faster than the adaptive forward differencing version,
probably since it didn't need to allocate ridiculous amounts of memory.
The memory usage of this was far better than anything we had had until then,
because for storing crossings it only needed 4*n bytes, where n is the
highest number of crossings on any scanline, as opposed to 4*numScanlines*n,
which is what I had before. Since we store curves, instead of the lines
produced by flattening curves, this storage is also reduced by a lot.
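
The Newton step itself looked roughly like this (a sketch with illustrative
coefficient names; the real code also reused the incrementally updated seed):

    // y(t) = ((ay*t + by)*t + cy)*t + dy, monotonic in y on [0,1]; find
    // the t where y(t) == scanline, starting from the linear estimate t0.
    static double crossingT(double ay, double by, double cy, double dy,
                            double scanline, double t0) {
        double t = t0;
        for (int i = 0; i < 8; i++) {
            double f  = ((ay * t + by) * t + cy) * t + dy - scanline;
            double fp = (3 * ay * t + 2 * by) * t + cy;   // y'(t)
            double next = t - f / fp;
            if (Math.abs(next - t) < 1e-6) {
                return next;
            }
            t = next;
        }
        return t;
    }

The crossing is then just the curve's x value evaluated at the returned t.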

But then I found a way to implement adaptive forward differencing AND save
memory. So, what I have now has the same memory usage described above, but
it's also a little faster (even now, that I haven't optimized it at all).
The webrev containing this isn't up yet (but everything else in the last
webrev link I sent is pretty much the same as what I have now on my machine,
so feel free to look at Stroker.java).

 Can the untransform be eliminated in the case of scaling?  (Whether just 
 for uniform scaling, or maybe even for non-uniform scaling with no 
 rotation or shearing?)

I'm glad you bring this up. I thought a bit about this, and the only thing
that causes problems in Stroker is that for some transformations, if we
feed Stroker the transformed curve, the width will not be constant throughout
the curve. Therefore we can eliminate the untransforming for every matrix
that keeps these lengths constant. This turns out to be any constant multiple
of an orthogonal matrix. So, if the transformation is A=[[a,b],[c,d]], all we
have to do is check for a*b == -c*d && a*a+c*c == b*b+d*d. If this is the case,
we can just make the pathIterator of the shape do the transforming, and we can
forget all about it (which is great, since what's hurting us is the
transformation of our output paths, not the untransformation of the input).
So, to answer your question, we can't eliminate the untransforming for
non-uniform scalings, but we can eliminate it for rotations, uniform scales,
and even for shears of the form [[1,b],[-b,1]].
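
In code, the check would be something like this (a sketch; in practice the
comparisons should probably be within an epsilon, as you pointed out earlier
about ==):

    import java.awt.geom.AffineTransform;

    static boolean keepsStrokeWidthConstant(AffineTransform at) {
        double a = at.getScaleX(), b = at.getShearX();   // A = [[a, b],
        double c = at.getShearY(), d = at.getScaleY();   //      [c, d]]
        // columns (a,c) and (b,d) perpendicular and of equal length
        // <=> A is a constant multiple of an orthogonal matrix
        return a * b == -c * d && a * a + c * c == b * b + d * d;
    }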

 You rock then!  A bug should be filed on closed JDK.  Can you file it or 
 send me your test case and I'll do it?

I filed it. Bug id: 6987950.

  Thank you,
 
 Ummm...  Thank *you*.  You're doing all the good work here, I'm just 
 sitting back, throwing out tiny crumbs of past experience and watching
 the ensuing woodchips fly with awe.  I've had on my wish list for some 
 time to be able to eliminate these last few closed source holdouts, but 
 the quality of the Ductus code was so high that I never got motivated to 
 try.  Who knows now...  ;-)

Well, I couldn't have done it without your help, so

Thank you,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 On 9/27/2010 7:43 AM, Denis Lila wrote:
  Hi Jim.
  How much faster?  I'm worried about this, especially given our
 tiled
  approach to requesting the data.  What was the bottleneck before?
  (It's been a while since I visited the code - we weren't computing
 the
  crossings for every curve in the path for every tile being
 generated
  were we?)
 
   Not much faster. I'm working on something better.
 
   I'm not sure about the bottleneck, but what we were doing
 before is:
  1. Flatten (by subdividing) every curve so that we deal only with
 lines.
  2. Add each line to a list sorted by y0. When end_rendering was
 called
  for each scanline we found the crossings of the scanline and every
 line
  in our line list, which we used to compute the alpha for that
 scanline's
  pixel row. All this would be put into RLE encoded temporary storage
 and it
  would be read back and converted into tile form by
 PiscesTileGenerator.
 
   Speaking of which, would it not be better to get rid of
 PiscesCache and
  just keep a buffer with the current tile row in Renderer.java. This
 would
  be possible because the specification for AATileGenerator says the
 iteration
  is like: for (y...) for (x...);.
  Why is 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-27 Thread Jim Graham

Hi Denis,

On 9/27/2010 7:43 AM, Denis Lila wrote:

Hi Jim.

How much faster?  I'm worried about this, especially given our tiled
approach to requesting the data.  What was the bottleneck before?
(It's been a while since I visited the code - we weren't computing the
crossings for every curve in the path for every tile being generated
were we?)


 Not much faster. I'm working on something better.


Then hopefully we can get to something with better memory and CPU costs.


 I'm not sure about the bottleneck, but what we were doing before is:
1. Flatten (by subdividing) every curve so that we deal only with lines.
2. Add each line to a list sorted by y0. When end_rendering was called
for each scanline we found the crossings of the scanline and every line
in our line list, which we used to compute the alpha for that scanline's
pixel row. All this would be put into RLE encoded temporary storage and it
would be read back and converted into tile form by PiscesTileGenerator.

 Speaking of which, would it not be better to get rid of PiscesCache and
just keep a buffer with the current tile row in Renderer.java. This would
be possible because the specification for AATileGenerator says the iteration
is like: for (y...) for (x...);.
Why is PiscesCache there? It isn't being used as a cache at all. Could it be?
Also, why do we output tiles, instead of just pixel rows (which I guess would
just be nx1 tiles). Is it because we would like to use getTypicalAlpha to 
eliminate
completely transparent or completely opaque regions as soon as possible (and the
longer a tile is the less of a chance it has at being either of those two)?


That was basically cramming what we had into the interface's box.  The 
cache existed for something that was being done on mobile, but it 
doesn't have much of a place in our APIs so it was just reused for tile 
generation.  If we have a much more direct way of doing it then it would 
be great to get rid of it.


I think we can support ALL1s and ALL0s reasonably without the cache.


I can see your points here.  I think there are solutions to avoid much
of the untransforming we can consider, but your solution works well so
let's get it in and then we can look at optimizations if we feel they
are causing a measurable problem later.


 I should say this isn't quite as bad as I might have made it seem. Firstly,
this IO handler class I made eliminates transformations when Dasher
communicates with Stroker. More importantly, no untransforming is done
when the transformation is just a translation or is the identity or is singular
and when STROKE_CONTROL is off, we only transform the output path. That's
because the most important reason for handling transforms the way I do now
is because we can't normalize untransformed paths, otherwise coordinate
adjustments might be magnified too much. So, we need to transform paths
before normalization. But we also can't do the stroking and widening
before the normalization. But if normalization is removed we can just pass
untransformed paths into Stroker, and transform its output (which is still
somewhat more expensive than only transforming the input path, since
Stroker produces many (3-7) curves for each input curve).


Can the untransform be eliminated in the case of scaling?  (Whether just 
for uniform scaling, or maybe even for non-uniform scaling with no 
rotation or shearing?)



I'm not sure I understand the reasoning of the control point
calculation.  I'll have to look at the code to register an opinion.


 I'm sorry, my explanation wasn't very clear. I attached a picture that
will hopefully clarify things.
But, in a way, the computation I use is forced on us. Suppose we have a
quadratic curve B and we need to compute one of its offsets C. C'(0)
and C'(1) will be parallel to B'(0) and B'(1) so we need to make sure
our computed offset has this property too (or it would look weird around
the endpoints). Now, B'(0) and B'(1) are parallel to p2-p1 and p3-p2
where p1,p2,p3 are the 3 control points that define B, so if the control
points of C are q1, q2, q3 then q2-q1 and q3-q2 must be parallel to p2-p1
and p3-p2 respectively. At this point, we need more constraints, since
our system is underdetermined. We use the constraints that q1 = C(0)
and q3 = C(1) (so, the endpoints of the computed offset are equal to the
endpoints of the ideal offset). All we have left to compute is q2, but
we know the direction of q2-q1 and the direction of q3-q2, so q2 must
lie on the lines defined by q1+t*(q2-q1) and q3+t*(q3-q2) so q2 must
be the intersection of these lines.
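
Concretely, the construction amounts to this sketch (illustrative code, not
the actual Stroker method; p1,p2,p3 are the quad's control points and w is
the offset distance along the left-hand normal):

    static double[] offsetQuad(double x1, double y1, double x2, double y2,
                               double x3, double y3, double w) {
        double dx1 = x2 - x1, dy1 = y2 - y1;              // p2 - p1
        double dx2 = x3 - x2, dy2 = y3 - y2;              // p3 - p2
        double l1 = Math.hypot(dx1, dy1), l2 = Math.hypot(dx2, dy2);
        // q1 = C(0), q3 = C(1): endpoints offset along the unit normals
        double q1x = x1 - w * dy1 / l1, q1y = y1 + w * dx1 / l1;
        double q3x = x3 - w * dy2 / l2, q3y = y3 + w * dx2 / l2;
        // q2: intersection of q1 + t*(dx1,dy1) with q3 + s*(dx2,dy2)
        // (den == 0, i.e. a degenerate straight quad, needs special casing)
        double den = dx1 * dy2 - dy1 * dx2;
        double t = ((q3x - q1x) * dy2 - (q3y - q1y) * dx2) / den;
        return new double[] { q1x, q1y, q1x + t * dx1, q1y + t * dy1, q3x, q3y };
    }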


I agree that if you are creating a parallel curve then intersection is 
the way to go.  I guess what I was potentially confused about was 
whether there are cases where you need to subdivide at all?  Regardless 
of subdivision, when you get down to the final step of creating the 
parallel curves then I believe offsetting and finding the intersection 
is correct (though I reserve the possibility 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-22 Thread Denis Lila
Hello Jim.

 I'll take a look at this if I can, but last minute JavaOne issues (like 
 a rewrite of my slides - gulp) are starting to take over my life for the 
 next couple of weeks.

That's ok. That webrev wasn't as polished as I thought. I've done a lot of
testing since then and fixed many bugs. I ran Java2D without any problems.
The frame rate for the animations isn't noticeably different, but that's because
they don't draw many curves. I put timers in pisces and the time spent in it
(including line/move/quad/curve calls to the output PathConsumer2D) is at least
3-4 times smaller than in pisces without this changeset. When many curves are
widened the improvement goes up to a factor of 17-20.
Antialiasing was also a bit better (this did come with a frame rate
improvement).

I also ran a 2D graphics test suite: http://icedtea.classpath.org/hg/gfx-test/
It generates thousands of images using combinations of different strokes
colours and shapes. It is fairly exhaustive, in that it uses all caps, all joins
many different line widths, different dashing patterns, different colours,
different shapes, and so on. It does this for a java implementation to be tested
and for a reference implementation and compares the generated images against 
each
other. I've been using the closed source java as a reference. Compared to
icedtea6 version 1.8, openjdk7 with my patch does much better. The number of
generated images that are identical to closed java has more than doubled. No
test suite performs worse and every image I've seen is closer to the reference
images. I have not put any of the test results online yet because that would be
400 megabytes and I'm not sure I'm allowed. I'll try tomorrow.

I've also rendered at least 5 random curves using this changeset
just to make sure there are no catastrophic failures (things like crashes, half
the screen going black, etc.) Everything looked good.

I should give a high level overview of how things have changed:
1) Antialiasing uses adaptive forward differencing. Now I rasterize each curve
as soon as I receive it and store the crossings, instead of storing curves and
computing the crossings as needed. This is faster, but not as memory friendly,
so I'm a bit uneasy about this decision. What do you think?
2) In Dasher.java I implemented the buffer to store initial segments.
3) For dashing, I compute curve lengths using the algorithm you told me about.
4) Transforms are handled differently. We used to transform everything before
feeding it to pisces. Since pisces has to compute offsets, it needed to know
about these transforms. This made things very confusing. I have introduced a
class which Stroker and Dasher use for IO that knows about transformations.
So, when a shape is being rendered, its path iterator transforms the points.
These transformed points are then normalized (if normalization is on). Then
they are passed through this IO handler, which untransforms the points and
feeds them to Dasher or Stroker, which pass their output through the IO
handler again, which transforms them. Unfortunately, this will do many more
transformations than before. The reason I did this is to avoid going back
and forth between user space and device space coordinates in the same file.
5) In stroker, we used to keep variables that stored the previous point
(px0,py0) and the second point (sx1 and sy1, I think). I eliminated these.
They were misleading and unnecessary and just don't make sense if we have
curves. They were misleading because the only way they were ever used was to
get tangent vectors at the start and current position in the path. I replaced
them with variables that hold these tangents. This is much clearer and saves
us many subtractions. Because of this, some of the method parameters have
changed. The computeMiter parameters are a good example of ones that should
have changed but didn't, because I didn't have time to refactor it. This
should be done, because if we use vectors it will be clearer and will help
with 9).
6) 0 length curves and lines were being ignored. This isn't what proprietary 
java
does (which is drawing caps/joins as if a tiny horizontal line going right had
just been drawn). I fixed this to match the behaviour of proprietary java.
Because of the change above, this turned out to be a 3 liner.
7) I put code that draws joins in its own method (drawJoin). This made the close
and finish methods much clearer and allowed me to fix this createStrokedShape
issue: http://bugs.openjdk.java.net/show_bug.cgi?id=100046
8) To compute cubic offset curves first I subdivide the original curve at points
where it starts bending too much (i.e. more than 90 degrees since the last
subdivision), has inflection points, and where one of the offset curves has
cusps (see comments in the file for more on this). Finding the offset cusps was
the main reason for 4), since if we worked with transformed coordinates there
could be shears and the linewidth would not be constant (we need 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-13 Thread Denis Lila
Hello Jim.

I think I finally have a version without correctness problems:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/

Assuming no bugs, there are still a few minor issues:
- whitespace (I'll fix this tomorrow)
- comments (also tomorrow)
- in dasher, there are variables called sx1, sy1. They seem useless
to me, and it would help a lot if they really are. Could you please look at this?

If anything at all is confusing in it, please contact me (e-mail or irc:
I'm on OFTC #openjdk. My nickname is dlila).

Thank you,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 Things got really busy for me over the past week so I wasn't able to 
 keep up with the discussion on this, but I will be looking more at it
 
 next week.  In the meantime it sounds like you are on the right track.
 
 I wish I'd have investigated it to the level you are at so I could be
 of 
 more immediate help, but hopefully I'll get there when I review your 
 various changes...
 
   ...jim
 
 On 9/7/2010 2:11 PM, Denis Lila wrote:
  Hello Jim.
 
  So, I finally have a webrev for serious consideration:
  http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/
  There are still some printing statements I used for debugging, and
  the whitespace is probably pretty bad (tell me if this poses a
 problem
  when reading the code, and I'll clean it up), but I don't want to
  waste time removing that stuff unless necessary, since this is
  doubtlessly not the last version. I also included a Test.java
  file that I found useful for testing and debugging. It has a main
  method, and it allows pisces to run as a standalong project in
  eclipse (as long as you set the JRE to be openjdk7 since it needs
  to know about AATileGenerator and some other non public
 interfaces).
 
   From testing it, the only problem I noticed is that it doesn't do
  very well with tight loops. So, a path like
  p.moveTo(0,0);p.curveTo(1000, 1000, 400, 500, 0, -150);
  isn't stroked very well when using the rotating algorithm. When
 using
  just the make monotonic algorithm it is ok (right now, it is set
 to
  use the latter - you can change this by uncommenting
 Stroker.java:1011
  and commenting out Stroker.java:1012). This leads me to believe
 that
  we need to detect and perhaps subdivide at loops in addition to
 the
  current subdivision locations. However, I have not yet looked too
 deeply
  into why the problem arises and how to fix it. I welcome
 suggestions.
 
  Thanks,
  Denis.
 
  I figured out what the problem is. The problem isn't really tight
 loops.
  The problem is cusps in the offset curves. These happen when the
 line width
  is equal to the radius of curvature of the curve being processed
 (although,
  this may be just a necessary condition and not sufficient, but this
 doesn't
  matter).
 
  It seems like we have to split at values of t where the above
 condition
  holds. However, I can't see a way to do this without resorting to
 Newton's method
  for finding the roots of RadiusOfCurvature(t) - lineWidth. It would
 be
  really easy, however, if we had the arc length parametrization of
 the curve
  in question, but this won't necessarily be a polynomial. A good way
 might be
  to find a polynomial approximation to its inverse (this would make
 dashing considerably
  easier too).
 
  Regards,
  Denis.
 
  - Denis Liladl...@redhat.com  wrote:
 
 
 
  - Jim Grahamjames.gra...@oracle.com  wrote:
 
  OK, I see.  You were doubting that the thing that came after
  Pisces
 
  could be that much different considering that Pisces is rendering
  many
 
  more sub-pixels.
 
  Actually, embarrassingly I think it can.  It just means the
 non-AA
  renderer has some performance issues.  One thing I can think of
 is
  that
  the SpanShapeIterator uses a native method call per path segment
 and
  the
  cost of the context switches into native and back for each path
  segment
  dominate the performance of long paths.  It was something I was
  meaning
  to fix for a long time (when that code was first written native
 code
  was
  so much faster than Java and the native transition was quick -
 since
  then Hotspot came along, got a lot better, and the native
  transitions
 
  got much, much slower).
 
  So, yes, this isn't out of the question...
 
...jim
 
  On 9/2/2010 3:40 PM, Denis Lila wrote:
  Use which?  The stroking code or the rendering code?
  I believe that the way I set it up was that Pisces replaced
 both
  the
  stroke widening/dashing code and the AA renderer - both were
  parts
  that
  we relied on Ductus for.  But, the widening code would talk to
  one
  of
  our other existing rasterizers for non-AA.  Look at
  LoopPipe.draw(sg2d, s).  It (eventually) calls
  RenderEngine.strokeTo()
  directed at a SpanShapeIterator...
 
  I think there's a misunderstanding. All I meant was that, even
  when
  AA is off,
  we do use pisces for widening, but it doesn't do any
  rasterization.
 
 
  

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-10 Thread Jim Graham

Hi Denis,

Things got really busy for me over the past week so I wasn't able to 
keep up with the discussion on this, but I will be looking more at it 
next week.  In the meantime it sounds like you are on the right track. 
I wish I'd have investigated it to the level you are at so I could be of 
more immediate help, but hopefully I'll get there when I review your 
various changes...


...jim

On 9/7/2010 2:11 PM, Denis Lila wrote:

Hello Jim.

So, I finally have a webrev for serious consideration:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/
There are still some printing statements I used for debugging, and
the whitespace is probably pretty bad (tell me if this poses a problem
when reading the code, and I'll clean it up), but I don't want to
waste time removing that stuff unless necessary, since this is
doubtlessly not the last version. I also included a Test.java
file that I found useful for testing and debugging. It has a main
method, and it allows pisces to run as a standalone project in
eclipse (as long as you set the JRE to be openjdk7 since it needs
to know about AATileGenerator and some other non public interfaces).

 From testing it, the only problem I noticed is that it doesn't do
very well with tight loops. So, a path like
p.moveTo(0,0);p.curveTo(1000, 1000, 400, 500, 0, -150);
isn't stroked very well when using the rotating algorithm. When using
just the make monotonic algorithm it is ok (right now, it is set to
use the latter - you can change this by uncommenting Stroker.java:1011
and commenting out Stroker.java:1012). This leads me to believe that
we need to detect and perhaps subdivide at loops in addition to the
current subdivision locations. However, I have not yet looked too deeply
into why the problem arises and how to fix it. I welcome suggestions.

Thanks,
Denis.


I figured out what the problem is. The problem isn't really tight loops.
The problem is cusps in the offset curves. These happen when the line width
is equal to the radius of curvature of the curve being processed (although,
this may be just a necessary condition and not sufficient, but this doesn't
matter).

It seems like we have to split at values of t where the above condition
holds. However, I can't see a way to do this without resorting to Newton's 
method
for finding the roots of RadiusOfCurvature(t) - lineWidth. It would be
really easy, however, if we had the arc length parametrization of the curve
in question, but this won't necessarily be a polynomial. A good way might be
to find a polynomial approximation to its inverse (this would make dashing 
considerably
easier too).

Regards,
Denis.

- Denis Liladl...@redhat.com  wrote:




- Jim Grahamjames.gra...@oracle.com  wrote:


OK, I see.  You were doubting that the thing that came after

Pisces


could be that much different considering that Pisces is rendering

many


more sub-pixels.

Actually, embarrassingly I think it can.  It just means the non-AA
renderer has some performance issues.  One thing I can think of is
that
the SpanShapeIterator uses a native method call per path segment and
the
cost of the context switches into native and back for each path
segment
dominate the performance of long paths.  It was something I was
meaning
to fix for a long time (when that code was first written native code
was
so much faster than Java and the native transition was quick - since
then Hotspot came along, got a lot better, and the native

transitions


got much, much slower).

So, yes, this isn't out of the question...

...jim

On 9/2/2010 3:40 PM, Denis Lila wrote:

Use which?  The stroking code or the rendering code?
I believe that the way I set it up was that Pisces replaced both

the

stroke widening/dashing code and the AA renderer - both were

parts

that

we relied on Ductus for.  But, the widening code would talk to

one

of

our other existing rasterizers for non-AA.  Look at
LoopPipe.draw(sg2d, s).  It (eventually) calls

RenderEngine.strokeTo()

directed at a SpanShapeIterator...


I think there's a misunderstanding. All I meant was that, even

when

AA is off,

we do use pisces for widening, but it doesn't do any

rasterization.



- Jim Grahamjames.gra...@oracle.com   wrote:


...jim

On 9/2/2010 3:20 PM, Denis Lila wrote:

Do we use Pisces for non-AA?  Pisces should clock in slower for

AA

than

non-AA, but I think we use one of the other pipes (not Ductus)

for

non-AA and maybe it just isn't as good as Pisces?


We definitely use it for non-AA.
I traced it.

Denis.

- Jim Grahamjames.gra...@oracle.comwrote:


On 9/2/2010 2:43 PM, Denis Lila wrote:

Actually, I had a question about the test I wrote which takes

20

seconds. When

I turned antialiasing on, the test dropped from 20 seconds to

2.5.

This is very

puzzling, since antialiasing is a generalization of

non-antialiased

rendering

(a generalization where we pretend there are 64 times more pixels than there actually are).


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-07 Thread Denis Lila
 Hello Jim.
 
 So, I finally have a webrev for serious consideration:
 http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/
 There are still some printing statements I used for debugging, and
 the whitespace is probably pretty bad (tell me if this poses a problem
 when reading the code, and I'll clean it up), but I don't want to
 waste time removing that stuff unless necessary, since this is
 doubtlessly not the last version. I also included a Test.java
 file that I found useful for testing and debugging. It has a main
 method, and it allows pisces to run as a standalone project in
 eclipse (as long as you set the JRE to be openjdk7 since it needs
 to know about AATileGenerator and some other non public interfaces).
 
 From testing it, the only problem I noticed is that it doesn't do
 very well with tight loops. So, a path like
 p.moveTo(0,0);p.curveTo(1000, 1000, 400, 500, 0, -150);
 isn't stroked very well when using the rotating algorithm. When using
 just the make monotonic algorithm it is ok (right now, it is set to
 use the latter - you can change this by uncommenting Stroker.java:1011
 and commenting out Stroker.java:1012). This leads me to believe that
 we need to detect and perhaps subdivide at loops in addition to the
 current subdivision locations. However, I have not yet looked too deeply
 into why the problem arises and how to fix it. I welcome suggestions.
 
 Thanks,
 Denis.

I figured out what the problem is. The problem isn't really tight loops.
The problem is cusps in the offset curves. These happen when the line width
is equal to the radius of curvature of the curve being processed (although,
this may be just a necessary condition and not sufficient, but this doesn't
matter).

It seems like we have to split at values of t where the above condition
holds. However, I can't see a way to do this without resorting to Newton's 
method
for finding the roots of RadiusOfCurvature(t) - lineWidth. It would be
really easy, however, if we had the arc length parametrization of the curve
in question, but this won't necessarily be a polynomial. A good way might be
to find a polynomial approximation to its inverse (this would make dashing 
considerably
easier too).

Regards,
Denis.

- Denis Lila dl...@redhat.com wrote:


 
 - Jim Graham james.gra...@oracle.com wrote:
 
  OK, I see.  You were doubting that the thing that came after
 Pisces
 
  could be that much different considering that Pisces is rendering
 many
 
  more sub-pixels.
 
  Actually, embarrassingly I think it can.  It just means the non-AA
  renderer has some performance issues.  One thing I can think of is
  that
  the SpanShapeIterator uses a native method call per path segment and
  the
  cost of the context switches into native and back for each path
  segment
  dominate the performance of long paths.  It was something I was
  meaning
  to fix for a long time (when that code was first written native code
  was
  so much faster than Java and the native transition was quick - since
  then Hotspot came along, got a lot better, and the native
 transitions
 
  got much, much slower).
 
  So, yes, this isn't out of the question...
 
  ...jim
 
  On 9/2/2010 3:40 PM, Denis Lila wrote:
   Use which?  The stroking code or the rendering code?
   I believe that the way I set it up was that Pisces replaced both
  the
   stroke widening/dashing code and the AA renderer - both were
 parts
  that
   we relied on Ductus for.  But, the widening code would talk to
 one
  of
   our other existing rasterizers for non-AA.  Look at
   LoopPipe.draw(sg2d, s).  It (eventually) calls
  RenderEngine.strokeTo()
   directed at a SpanShapeIterator...
  
   I think there's a misunderstanding. All I meant was that, even
 when
  AA is off,
   we do use pisces for widening, but it doesn't do any
 rasterization.
  
  
   - Jim Graham james.gra...@oracle.com  wrote:
  
...jim
  
   On 9/2/2010 3:20 PM, Denis Lila wrote:
   Do we use Pisces for non-AA?  Pisces should clock in slower for
  AA
   than
   non-AA, but I think we use one of the other pipes (not Ductus)
  for
   non-AA and maybe it just isn't as good as Pisces?
  
   We definitely use it for non-AA.
   I traced it.
  
   Denis.
  
   - Jim Graham james.gra...@oracle.com   wrote:
  
   On 9/2/2010 2:43 PM, Denis Lila wrote:
   Actually, I had a question about the test I wrote which takes
  20
   seconds. When
   I turned antialiasing on, the test dropped from 20 seconds to
   2.5.
   This is very
   puzzling, since antialiasing is a generalization of
   non-antialiased
   rendering
   (a generalization where we pretend there are 64 times more
  pixels
   than there
   actually are). Of course, the paths followed after pisces for
  AA
   and
   non-AA are
   completely different, but whatever came after pisces in the
   non-AA
   case would
   have the same input as Renderer has in the AA case (input
  gotten
   from Stroker).
   Can you take a guess as to what was causing such a large difference?

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-03 Thread Denis Lila
 the cost of the context switches into native and back for each path
 segment dominate the performance of long paths.

I see. That makes sense.

 It was something I was meaning to fix for a long time (when that code
 was first written native code was so much faster than Java and the 
 native transition was quick - since then Hotspot came along, got a 
 lot better, and the native transitions got much, much slower).

Do you think this will still be worth it after removing flattening?

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 OK, I see.  You were doubting that the thing that came after Pisces 
 could be that much different considering that Pisces is rendering many 
 more sub-pixels.
 
 Actually, embarrassingly I think it can.  It just means the non-AA 
 renderer has some performance issues.  One thing I can think of is
 that 
 the SpanShapeIterator uses a native method call per path segment and
  
 
 So, yes, this isn't out of the question...
 
   ...jim
 
 On 9/2/2010 3:40 PM, Denis Lila wrote:
  Use which?  The stroking code or the rendering code?
  I believe that the way I set it up was that Pisces replaced both
 the
  stroke widening/dashing code and the AA renderer - both were parts
 that
  we relied on Ductus for.  But, the widening code would talk to one
 of
  our other existing rasterizers for non-AA.  Look at
  LoopPipe.draw(sg2d, s).  It (eventually) calls
 RenderEngine.strokeTo()
  directed at a SpanShapeIterator...
 
  I think there's a misunderstanding. All I meant was that, even when
 AA is off,
  we do use pisces for widening, but it doesn't do any rasterization.
 
 
  - Jim Graham james.gra...@oracle.com  wrote:
 
 ...jim
 
  On 9/2/2010 3:20 PM, Denis Lila wrote:
  Do we use Pisces for non-AA?  Pisces should clock in slower for
 AA
  than
  non-AA, but I think we use one of the other pipes (not Ductus)
 for
  non-AA and maybe it just isn't as good as Pisces?
 
  We definitely use it for non-AA.
  I traced it.
 
  Denis.
 
  - Jim Graham james.gra...@oracle.com   wrote:
 
  On 9/2/2010 2:43 PM, Denis Lila wrote:
  Actually, I had a question about the test I wrote which takes
 20
  seconds. When
  I turned antialiasing on, the test dropped from 20 seconds to
  2.5.
  This is very
  puzzling, since antialiasing is a generalization of
  non-antialiased
  rendering
  (a generalization where we pretend there are 64 times more
 pixels
  than there
  actually are). Of course, the paths followed after pisces for
 AA
  and
  non-AA are
  completely different, but whatever came after pisces in the
  non-AA
  case would
  have the same input as Renderer has in the AA case (input
 gotten
  from Stroker).
  Can you take a guess as to what was causing such a large
  difference?
 
 
 
  I think Pisces was integrated only as a Ductus replacement which
  means
 
  it was used only for AA, but check if I'm mistaken...
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-03 Thread Denis Lila
Hello Jim.

So, I finally have a webrev for serious consideration:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/
There are still some printing statements I used for debugging, and 
the whitespace is probably pretty bad (tell me if this poses a problem
when reading the code, and I'll clean it up), but I don't want to 
waste time removing that stuff unless necessary, since this is 
doubtlessly not the last version. I also included a Test.java 
file that I found useful for testing and debugging. It has a main
method, and it allows pisces to run as a standalone project in 
eclipse (as long as you set the JRE to be openjdk7 since it needs
to know about AATileGenerator and some other non public interfaces).

From testing it, the only problem I noticed is that it doesn't do
very well with tight loops. So, a path like 
p.moveTo(0,0);p.curveTo(1000, 1000, 400, 500, 0, -150);
isn't stroked very well when using the rotating algorithm. When using
just the make monotonic algorithm it is ok (right now, it is set to
use the latter - you can change this by uncommenting Stroker.java:1011
and commenting out Stroker.java:1012). This leads me to believe that
we need to detect and perhaps subdivide at loops in addition to the
current subdivision locations. However, I have not yet looked too deeply
into why the problem arises and how to fix it. I welcome suggestions.

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 OK, I see.  You were doubting that the thing that came after Pisces
 
 could be that much different considering that Pisces is rendering many
 
 more sub-pixels.
 
 Actually, embarrassingly I think it can.  It just means the non-AA 
 renderer has some performance issues.  One thing I can think of is
 that 
 the SpanShapeIterator uses a native method call per path segment and
 the 
 cost of the context switches into native and back for each path
 segment 
 dominate the performance of long paths.  It was something I was
 meaning 
 to fix for a long time (when that code was first written native code
 was 
 so much faster than Java and the native transition was quick - since 
 then Hotspot came along, got a lot better, and the native transitions
 
 got much, much slower).
 
 So, yes, this isn't out of the question...
 
   ...jim
 
 On 9/2/2010 3:40 PM, Denis Lila wrote:
  Use which?  The stroking code or the rendering code?
  I believe that the way I set it up was that Pisces replaced both
 the
  stroke widening/dashing code and the AA renderer - both were parts
 that
  we relied on Ductus for.  But, the widening code would talk to one
 of
  our other existing rasterizers for non-AA.  Look at
  LoopPipe.draw(sg2d, s).  It (eventually) calls
 RenderEngine.strokeTo()
  directed at a SpanShapeIterator...
 
  I think there's a misunderstanding. All I meant was that, even when
 AA is off,
  we do use pisces for widening, but it doesn't do any rasterization.
 
 
  - Jim Graham james.gra...@oracle.com  wrote:
 
 ...jim
 
  On 9/2/2010 3:20 PM, Denis Lila wrote:
  Do we use Pisces for non-AA?  Pisces should clock in slower for
 AA
  than
  non-AA, but I think we use one of the other pipes (not Ductus)
 for
  non-AA and maybe it just isn't as good as Pisces?
 
  We definitely use it for non-AA.
  I traced it.
 
  Denis.
 
  - Jim Graham james.gra...@oracle.com   wrote:
 
  On 9/2/2010 2:43 PM, Denis Lila wrote:
  Actually, I had a question about the test I wrote which takes
 20
  seconds. When
  I turned antialiasing on, the test dropped from 20 seconds to
  2.5.
  This is very
  puzzling, since antialiasing is a generalization of
  non-antialiased
  rendering
  (a generalization where we pretend there are 64 times more
 pixels
  than there
  actually are). Of course, the paths followed after pisces for
 AA
  and
  non-AA are
  completely different, but whatever came after pisces in the
  non-AA
  case would
  have the same input as Renderer has in the AA case (input
 gotten
  from Stroker).
  Can you take a guess as to what was causing such a large
  difference?
 
 
 
  I think Pisces was integrated only as a Ductus replacement which
  means
 
  it was used only for AA, but check if I'm mistaken...
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-03 Thread Jim Graham

On 9/3/2010 6:03 AM, Denis Lila wrote:

the cost of the context switches into native and back for each path
segment dominate the performance of long paths.


I see. That makes sense.


It was something I was meaning to fix for a long time (when that code
was first written native code was so much faster than Java and the
native transition was quick - since then Hotspot came along, got a
lot better, and the native transitions got much, much slower).


Do you think this will still be worth it after removing flattening?


That depends on the performance differential after your de-flattening 
fixes.  Are both now relatively close in performance?


Either way I imagine that performance will improve if we reduce the 
native interface transitions - it just may change in relative priority 
if your new widener is less abusive towards it...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-02 Thread Denis Lila
Hello Jim.

Sorry for all the e-mails.
I implemented the rotating version. The rotation introduces small
numerical errors, so when I solve for the roots of dx/dt and dy/dt,
the values of t I get are slightly off. So for a rotated quarter
circle, what happens is that it gives me a root at, say, t=0.9996...
so I still end up subdividing quarter circles. I fixed this by
ignoring all roots outside of (0.001,0.999), instead of ignoring
roots outside of (0,1) which is the ideal solution. I don't like
hardcoding constants like this. Add to this the higher complexity
of the rotation, and I'm tempted to say it might be better to just
bite the bullet on rotated quarter circles and widen them using 2
curves per side.

I had one question: what do you think of widening all curves using
quadratic curves? Their great simplicity might make things safer. We
can guarantee that none of the strange behaviour I've seen and described
with cubic curves will arise (but we'd have to use more of them).

Also, I've tested the current implementation with a few hundred random
cubic paths, and everything looks good so far.

Thanks,
Denis.

- Denis Lila dl...@redhat.com wrote:

 Hello Jim.
 
  I think this would actually ensure that every resulting curve can
  be rotated to be made monotonic in both x and y (at least after
  inflections are eliminated), which means it's at least as good as
  what I have now.
 
 While that first statement is true, it unfortunately does not mean
 it's
 at least as good as breaking the curve into monotonic pieces. I
 implemented the angle checking method, and the problem is that for
 curves like (0,0),(1000,1),(-1000,1),(0,0) it is very bad. That's
 because I implemented it by subdividing at t=0.5, so the angles
 in the resulting curves' polygons are still wide. After enough
 subdivisions the polygons would have angles below 45 degrees, but that would
 defeat the point, since we're trying to minimize the number of output
 curves.
 I can't think of a good way to choose the subdivision t so that
 this method is as good as the rotation and make monotonic one (nothing
 with any properties I can prove, anyway), so I'll implement the
 rotating version, as much as I don't like it. At least it gives a good
 upper bound in the number of output curves.
 
 Regards,
 Denis.
 
 - Denis Lila dl...@redhat.com wrote:
 
  Hello Jim.
  Thanks for taking the time to think about this.
 
  I implemented a few different offsetting schemes. On well behaved
  curves, my original parallel first and last control vectors and go
  through B(0.5) algorithm was by far the best. Theoretically it is
  also somewhat better justified than the others (some of which were
  like Apache Harmony's way - offsetting p2 and p3), since we're
  interpolating instead of just using some heuristic. It is also
  the only one I've encountered that widens quarter circles optimally,
  and it only needs one curve per side too (i.e. if we have to widen
  the path of a PathIterator of a circle with radius r, using width w,
  Pisces' output will be identical to the output of the PathIterators
  of 2 circles with radii r+w and r-w (perhaps not identical
 identical,
  since PathIterators can use doubles, and we only use floats in
 pisces,
  but... that's nitpicking)).
 
  As I've said before, the only 2 problems with it were that it was
 bad
  on high curvatures (but this was fixed by breaking down the curves
  into
  monotonic ones), and it was bad on curves like
  p.moveTo(0,0);p.curveTo(100,100,0,100,100,1). I thought the latter
 was
  fixable with the d1/(d1+d2) algorithm I suggested in my previous
  e-mail.
  I implemented it, and it turns out I was wrong. Then I came up with
 my
  own
  variation of that algorithm (the original one I used was from Don
  Lancaster's
  website) that did sort of work. But then I implemented your
 suggestion
  of
  splitting the curve at the inflection points as well as breaking it
  into
  monotonic pieces, and the problem was gone. I didn't even have to
 use
  the
  fancy variation of the d1/(d1 + d2) algorithm - just the plain old
  interpolation one worked (I should say that the fancy one is still
  probably
  better, but undetectably so, now that we break at inflection points
  and at
  xy direction changes, so the added complexity is probably not worth
  it).
 
  I've attached my latest Stroker.java, if you want to take a look at
  these
  algorithms (they're in computeOffsetWay1 (for the old interpolation)
  and
  computeOffsetWay3 (for the fancy version). There are 2 more, but
 these
  aren't as good, and one is just shameful). I didn't make a webrev
  because
  I still need to fix how it handles cusps (which should be easy), and
 I
  need to implement a way to avoid subdividing a rotated quarter
 circle.
  I can do this either by rotating curves so that p2-p1 is parallel to
  the
  x axis and then subdivide like I do now (i.e. make monotonic, break
 at
  inflections) or get rid of the monotonic subdivision, and instead subdivide by checking the angles of the control polygon.

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-02 Thread Denis Lila
 Do we use Pisces for non-AA?  Pisces should clock in slower for AA than 
 non-AA, but I think we use one of the other pipes (not Ductus) for 
 non-AA and maybe it just isn't as good as Pisces?

We definitely use it for non-AA.
I traced it.

Denis.

- Jim Graham james.gra...@oracle.com wrote:

 On 9/2/2010 2:43 PM, Denis Lila wrote:
  Actually, I had a question about the test I wrote which takes 20
 seconds. When
  I turned antialiasing on, the test dropped from 20 seconds to 2.5.
 This is very
  puzzling, since antialiasing is a generalization of non-antialiased
 rendering
  (a generalization where we pretend there are 64 times more pixels
 than there
  actually are). Of course, the paths followed after pisces for AA and
 non-AA are
  completely different, but whatever came after pisces in the non-AA
 case would
  have the same input as Renderer has in the AA case (input gotten
 from Stroker).
  Can you take a guess as to what was causing such a large
 difference?
 

 
 I think Pisces was integrated only as a Ductus replacement which means
 
 it was used only for AA, but check if I'm mistaken...
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-02 Thread Denis Lila
 Use which?  The stroking code or the rendering code? 
 I believe that the way I set it up was that Pisces replaced both the 
 stroke widening/dashing code and the AA renderer - both were parts that 
 we relied on Ductus for.  But, the widening code would talk to one of 
 our other existing rasterizers for non-AA.  Look at 
 LoopPipe.draw(sg2d, s).  It (eventually) calls RenderEngine.strokeTo()
 directed at a SpanShapeIterator...

I think there's a misunderstanding. All I meant was that, even when AA is off,
we do use pisces for widening, but it doesn't do any rasterization.


- Jim Graham james.gra...@oracle.com wrote:

   ...jim
 
 On 9/2/2010 3:20 PM, Denis Lila wrote:
  Do we use Pisces for non-AA?  Pisces should clock in slower for AA
 than
  non-AA, but I think we use one of the other pipes (not Ductus) for
  non-AA and maybe it just isn't as good as Pisces?
 
  We definitely use it for non-AA.
  I traced it.
 
  Denis.
 
  - Jim Graham james.gra...@oracle.com  wrote:
 
  On 9/2/2010 2:43 PM, Denis Lila wrote:
  Actually, I had a question about the test I wrote which takes 20
  seconds. When
  I turned antialiasing on, the test dropped from 20 seconds to
 2.5.
  This is very
  puzzling, since antialiasing is a generalization of
 non-antialiased
  rendering
  (a generalization where we pretend there are 64 times more pixels
  than there
  actually are). Of course, the paths followed after pisces for AA
 and
  non-AA are
  completely different, but whatever came after pisces in the
 non-AA
  case would
  have the same input as Renderer has in the AA case (input gotten
  from Stroker).
  Can you take a guess as to what was causing such a large
  difference?
 
 
 
  I think Pisces was integrated only as a Ductus replacement which
 means
 
  it was used only for AA, but check if I'm mistaken...
 
 ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-02 Thread Jim Graham
OK, I see.  You were doubting that the thing that came after Pisces 
could be that much different considering that Pisces is rendering many 
more sub-pixels.


Actually, embarrassingly I think it can.  It just means the non-AA 
renderer has some performance issues.  One thing I can think of is that 
the SpanShapeIterator uses a native method call per path segment and the 
cost of the context switches into native and back for each path segment 
dominate the performance of long paths.  It was something I was meaning 
to fix for a long time (when that code was first written native code was 
so much faster than Java and the native transition was quick - since 
then Hotspot came along, got a lot better, and the native transitions 
got much, much slower).


So, yes, this isn't out of the question...

...jim

On 9/2/2010 3:40 PM, Denis Lila wrote:

Use which?  The stroking code or the rendering code?
I believe that the way I set it up was that Pisces replaced both the
stroke widening/dashing code and the AA renderer - both were parts that
we relied on Ductus for.  But, the widening code would talk to one of
our other existing rasterizers for non-AA.  Look at
LoopPipe.draw(sg2d, s).  It (eventually) calls RenderEngine.strokeTo()
directed at a SpanShapeIterator...


I think there's a misunderstanding. All I meant was that, even when AA is off,
we do use pisces for widening, but it doesn't do any rasterization.


- Jim Graham james.gra...@oracle.com  wrote:


...jim

On 9/2/2010 3:20 PM, Denis Lila wrote:

Do we use Pisces for non-AA?  Pisces should clock in slower for AA

than

non-AA, but I think we use one of the other pipes (not Ductus) for
non-AA and maybe it just isn't as good as Pisces?


We definitely use it for non-AA.
I traced it.

Denis.

- Jim Graham james.gra...@oracle.com   wrote:


On 9/2/2010 2:43 PM, Denis Lila wrote:

Actually, I had a question about the test I wrote which takes 20

seconds. When

I turned antialiasing on, the test dropped from 20 seconds to

2.5.

This is very

puzzling, since antialiasing is a generalization of

non-antialiased

rendering

(a generalization where we pretend there are 64 times more pixels

than there

actually are). Of course, the paths followed after pisces for AA

and

non-AA are

completely different, but whatever came after pisces in the

non-AA

case would

have the same input as Renderer has in the AA case (input gotten

from Stroker).

Can you take a guess as to what was causing such a large

difference?





I think Pisces was integrated only as a Ductus replacement which

means


it was used only for AA, but check if I'm mistaken...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-01 Thread Denis Lila
Hello Jim.

 I think this would actually ensure that every resulting curve can 
 be rotated to be made monotonic in both x and y (at least after 
 inflections are eliminated), which means it's at least as good as
 what I have now.

While that first statement is true, it unfortunately does not mean it's
at least as good as breaking the curve into monotonic pieces. I
implemented the angle checking method, and the problem is that for
curves like (0,0),(1000,1),(-1000,1),(0,0) it is very bad. That's
because I implemented it by subdividing at t=0.5, so the angles
in the resulting curves' polygons are still wide. After enough
subdivisions the polygons would have angles below 45 degrees, but that would
defeat the point, since we're trying to minimize the number of output
curves.
I can't think of a good way to choose the subdivision t so that
this method is as good as the rotation and make monotonic one (nothing
with any properties I can prove, anyway), so I'll implement the 
rotating version, as much as I don't like it. At least it gives a good
upper bound in the number of output curves.

Regards,
Denis.

- Denis Lila dl...@redhat.com wrote:

 Hello Jim.
 Thanks for taking the time to think about this.
 
 I implemented a few different offsetting schemes. On well behaved
 curves, my original parallel first and last control vectors and go
 through B(0.5) algorithm was by far the best. Theoretically it is
 also somewhat better justified than the others (some of which were
 like Apache Harmony's way - offsetting p2 and p3), since we're
 interpolating instead of just using some heuristic. It is also
 the only one I've encountered that widens quarter circles optimally,
 and it only needs one curve per side too (i.e. if we have to widen
 the path of a PathIterator of a circle with radius r, using width w,
 Pisces' output will be identical to the output of the PathIterators
 of 2 circles with radii r+w and r-w (perhaps not identical identical,
 since PathIterators can use doubles, and we only use floats in pisces,
 but... that's nitpicking)).
 
 As I've said before, the only 2 problems with it were that it was bad
 on high curvatures (but this was fixed by breaking down the curves
 into
 monotonic ones), and it was bad on curves like
 p.moveTo(0,0);p.curveTo(100,100,0,100,100,1). I thought the latter was
 fixable with the d1/(d1+d2) algorithm I suggested in my previous
 e-mail.
 I implemented it, and it turns out I was wrong. Then I came up with my
 own
 variation of that algorithm (the original one I used was from Don
 Lancaster's
 website) that did sort of work. But then I implemented your suggestion
 of
 splitting the curve at the inflection points as well as breaking it
 into
 monotonic pieces, and the problem was gone. I didn't even have to use
 the
 fancy variation of the d1/(d1 + d2) algorithm - just the plain old
 interpolation one worked (I should say that the fancy one is still
 probably
 better, but undetectably so, now that we break at inflection points
 and at
 xy direction changes, so the added complexity is probably not worth
 it).
 
 I've attached my latest Stroker.java, if you want to take a look at
 these
 algorithms (they're in computeOffsetWay1 (for the old interpolation)
 and
 computeOffsetWay3 (for the fancy version). There are 2 more, but these
 aren't as good, and one is just shameful). I didn't make a webrev
 because
 I still need to fix how it handles cusps (which should be easy), and I
 need to implement a way to avoid subdividing a rotated quarter circle.
 I can do this either by rotating curves so that p2-p1 is parallel to
 the
 x axis and then subdivide like I do now (i.e. make monotonic, break at
 inflections) or get rid of the monotonic subdivision, and instead
 subdivide
 by checking the angles of the control polygon. I could make it so it
 subdivides whenever the angles between p2-p1 and p3-p2 or p3-p2 and
 p4-p3
 are greater than 45 degrees. 
 
 Very well. I've convinced myself. I'll implement the latter.
 
 Thanks,
 Denis.
 
 - Jim Graham james.gra...@oracle.com wrote:
 
  Hi Denis,
 
  Just to let you know that I've seen this and I'm thinking.
 
  But it will take a bit of more thinking when I have time for more.
  Hopefully in a couple of days.
 
  For one thing, it sounds like you already understand the Apache
  Harmony
  approach quite a bit better than I ever did and, in particular, why
 it
 
  didn't work well - which is encouraging.  Hopefully your solution
 will
 
  sound pretty good when I get around to wrapping my head around it...
 
  ...jim
 
  On 8/30/2010 3:03 PM, Denis Lila wrote:
   Hello Jim.
  
   One way in which they may not break enough is that I think that
   inflections also need to be broken in order to find a parallel
  curve
   (though I suppose a very tiny inflection might still be
  approximated by
   a parallel curve easily) and inflections can happen at any angle
  without
   going horizontal or vertical.
  
It wouldn't be hard 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-25 Thread Denis Lila
Hello Jim.

 I think a more dynamic approach that looked at how regular the curve
 was would do better.  Regular is hard to define, but for instance a 
 bezier section of a circle could have parallel curves computed very 
 easily without having to flatten or subdivide further.  Curves with 
 inflections probably require subdividing to get an accurate parallel
 curve.

I'm not sure if you read it, but after the email with the webrev link
I sent another describing a different method of widening: split the
curve at every t where dx/dt == 0 or dy/dt == 0. This guarantees that
there will be no more than 5 curves per side, and since each curve will
be monotonic in both x and y the curve that interpolates its parallel 
should do a pretty good job.

This is far better than interpolating at regular t intervals, but I'm 
trying to find a better way. I don't like this because the split
depend not only on the curve itself, but also on the axes. The axes are
arbitrary, so this is not good. For example a curve like this 
|
\_ would get widened by 1 curve per side (which is good and optimal), but
if we rotate this curve by, say, 30 degrees it would be widened by 2 curves
per side.
It also doesn't handle cusps and locations of high curvature very well (although
I think the latter is a numerical issue that could be made better by using 
doubles).

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 On 8/23/2010 4:18 PM, Denis Lila wrote:
   To widen cubic curves, I use a cubic spline with a fixed number
 of curves for
  each curve to be widened. This was meant to be temporary, until I
 could find a
  better algorithm for determining the number of curves in the spline,
 but I
  discovered today that that won't do it.
   For example, the curve p.moveTo(0,0),p.curveTo(84.0, 62.0,
 32.0, 34.0, 28.0, 5.0)
  looks bad all the way up to ~200 curves. Obviously, this is
 unacceptable.
 
  It would be great if anyone has any better ideas for how to go about
 this.
  To me it seems like the problem is that in the webrev I chop up the
 curve to be
  interpolated at equal intervals of the parameter.

 
 Perhaps looking at the rate of change of the slope (2nd and/or 3rd 
 derivatives) would help to figure out when a given section of curve
 can 
 be approximated with a parallel version?
 
 I believe that the BasicStroke class of Apache Harmony returns widened
 
 curves, but when I tested it it produced a lot more curves than Ductus
 
 (still, probably a lot less than the lines that would be produced by 
 flattening) and it had some numerical problems.  In the end I decided
 to 
 leave our Ductus stuff in there until someone could come up with a
 more 
 reliable Open Source replacement, but hopefully that code is close 
 enough to be massaged into working order.
 
 You can search the internet for parallel curves and find lots of 
 literature on the subject.  I briefly looked through the web sites,
 but 
 didn't have enough time to remember enough calculus and trigonometry
 to 
 garner a solution out of it all with the time that I had.  Maybe
 you'll 
 have better luck following the algorithms... ;-)
 
 As far as flattening at the lowest level when doing scanline
 conversion, 
 I like the idea of using forward differencing as it can create an 
 algorithm that doesn't require all of the intermediate storage that a
 
 subdividing flattener requires.  One reference that describes the 
 technique is on Dr. Dobbs site, though I imagine there are many
 others:
 
 http://www.drdobbs.com/184403417;jsessionid=O5N5MDJRDMIKHQE1GHOSKH4ATMY32JVN
 
 You can also look at the code in 
 src/share/native/sun/java2d/pipe/ProcessPath.c for some examples of 
 forward differencing in use (and that code also has similar techniques
 
 to what you did to first chop the path into vertical pieces).  BUT 
 (*Nota Bene*), I must warn you that the geometry of the path is
 somewhat 
 perturbed in that code since it tries to combine path normalization
 
 and rasterization into a single process.  It's not rendering pure 
 geometry, it is rendering tweaked geometry which tries to make non-AA
 
 circles look round and other such aesthetics-targeted impurities. 
 While 
 the code does have good examples of how to set up and evaluate forward
 
 differencing equations, don't copy too many of the details or you
 might 
 end up copying some of the tweaks and the results will look strange 
 under AA.  The Dr. Dobbs article should be your numerical reference
 and 
 that reference code a practical, but incompatible, example...
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-25 Thread Jim Graham

Hi Denis,

At the bottom-most rendering level monotonic curves can be cool to deal 
with, but I'm dubious that they help with widening.  For one things, I 
think you need more breaks than they would give you and also they might 
sometimes break a curve when it doesn't need it.


One way in which they may not break enough is that I think that 
inflections also need to be broken in order to find a parallel curve 
(though I suppose a very tiny inflection might still be approximated by 
a parallel curve easily) and inflections can happen at any angle without 
going horizontal or vertical.


Secondly, although a circle tends to be represented by quadrant sections 
which are monotonic, a circle rotated by 45 degrees would have 
horizontal and vertical sections in the middle of each quadrant and you 
would split those, but really they can be made parallel regardless of 
angle so these would be unnecessary splits.


My belief is that lengths and angles of the control polygon help 
determine if it is well-behaved and can be made parallel simply by 
offsetting.  Some formula involving those values would likely be happy 
with circle sections regardless of their angle of rotation.  I believe 
the Apache Harmony version of BasicStroke used those criteria...


...jim

On 8/25/2010 2:36 PM, Denis Lila wrote:

Hello Jim.


I think a more dynamic approach that looked at how regular the curve
was would do better.  Regular is hard to define, but for instance a
bezier section of a circle could have parallel curves computed very
easily without having to flatten or subdivide further.  Curves with
inflections probably require subdividing to get an accurate parallel
curve.


I'm not sure if you read it, but after the email with the webrev link
I sent another describing a different method of widening: split the
curve at every t where dx/dt == 0 or dy/dt == 0. This guarantees that
there will be no more than 5 curves per side, and since each curve will
be monotonic in both x and y the curve that interpolates its parallel
should do a pretty good job.

This is far better than interpolating at regular t intervals, but I'm
trying to find a better way. I don't like this because the split
depend not only on the curve itself, but also on the axes. The axes are
arbitrary, so this is not good. For example a curve like this
|
\_ would get widened by 1 curve per side (which is good and optimal), but
if we rotate this curve by, say, 30 degrees it would be widened by 2 curves
per side.
It also doesn't handle cusps and locations of high curvature very well (although
I think the latter is a numerical issue that could be made better by using
doubles).

Regards,
Denis.

- Jim Graham james.gra...@oracle.com  wrote:


Hi Denis,

On 8/23/2010 4:18 PM, Denis Lila wrote:

  To widen cubic curves, I use a cubic spline with a fixed number

of curves for

each curve to be widened. This was meant to be temporary, until I

could find a

better algorithm for determining the number of curves in the spline,

but I

discovered today that that won't do it.
  For example, the curve p.moveTo(0,0),p.curveTo(84.0, 62.0,

32.0, 34.0, 28.0, 5.0)

looks bad all the way up to ~200 curves. Obviously, this is

unacceptable.


It would be great if anyone has any better ideas for how to go about

this.

To me it seems like the problem is that in the webrev I chop up the

curve to be

interpolated at equal intervals of the parameter.




Perhaps looking at the rate of change of the slope (2nd and/or 3rd
derivatives) would help to figure out when a given section of curve
can
be approximated with a parallel version?

I believe that the BasicStroke class of Apache Harmony returns widened

curves, but when I tested it it produced a lot more curves than Ductus

(still, probably a lot less than the lines that would be produced by
flattening) and it had some numerical problems.  In the end I decided
to
leave our Ductus stuff in there until someone could come up with a
more
reliable Open Source replacement, but hopefully that code is close
enough to be massaged into working order.

You can search the internet for parallel curves and find lots of
literature on the subject.  I briefly looked through the web sites,
but
didn't have enough time to remember enough calculus and trigonometry
to
garner a solution out of it all with the time that I had.  Maybe
you'll
have better luck following the algorithms... ;-)

As far as flattening at the lowest level when doing scanline
conversion,
I like the idea of using forward differencing as it can create an
algorithm that doesn't require all of the intermediate storage that a

subdividing flattener requires.  One reference that describes the
technique is on Dr. Dobbs site, though I imagine there are many
others:

http://www.drdobbs.com/184403417;jsessionid=O5N5MDJRDMIKHQE1GHOSKH4ATMY32JVN

You can also look at the code in
src/share/native/sun/java2d/pipe/ProcessPath.c for some examples of
forward differencing in use (and that code also has similar techniques to what you did to first chop the path into vertical pieces). 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-24 Thread Denis Lila
Hello again.

 It would be great if anyone has any better ideas for how to go about this.
 To me it seems like the problem is that in the webrev I chop up the curve to 
 be
 interpolated at equal intervals of the parameter.

I implemented another version that detects where either dx/dt or dy/dt is 0, and
splits the curve there. This works pretty well for all the curves I've tested 
(except ones containing cusps, or something close to a cusp, like 
p.moveTo(0,0);
p.curveTo(100,100,0,100,100,0);). Best of all, this:

 For example, the curve p.moveTo(0,0),p.curveTo(84.0, 62.0, 32.0, 34.0, 28.0, 
 5.0)

is no longer a problem.

There is another problem with this method (other than the cusps, which can 
probably
be handled easily as a special case): it is axis-dependent. Ideally, it 
shouldn't be
because the optimal subdivisions of a curve are at values of t that do not 
change
under rotations and translations, but with this method, the t values at which I 
split
the curve do change.

A better way would take into account curve flatness instead of changes in x or y
direction.

Thanks,
Denis.



Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-24 Thread Jim Graham

Hi Denis,

On 8/23/2010 4:18 PM, Denis Lila wrote:

 To widen cubic curves, I use a cubic spline with a fixed number of curves 
for
each curve to be widened. This was meant to be temporary, until I could find a
better algorithm for determining the number of curves in the spline, but I
discovered today that that won't do it.
 For example, the curve p.moveTo(0,0),p.curveTo(84.0, 62.0, 32.0, 34.0, 
28.0, 5.0)
looks bad all the way up to ~200 curves. Obviously, this is unacceptable.

It would be great if anyone has any better ideas for how to go about this.
To me it seems like the problem is that in the webrev I chop up the curve to be
interpolated at equal intervals of the parameter.


I think a more dynamic approach that looked at how regular the curve 
was would do better.  Regular is hard to define, but for instance a 
bezier section of a circle could have parallel curves computed very 
easily without having to flatten or subdivide further.  Curves with 
inflections probably require subdividing to get an accurate parallel curve.


Perhaps looking at the rate of change of the slope (2nd and/or 3rd 
derivatives) would help to figure out when a given section of curve can 
be approximated with a parallel version?


I believe that the BasicStroke class of Apache Harmony returns widened 
curves, but when I tested it it produced a lot more curves than Ductus 
(still, probably a lot less than the lines that would be produced by 
flattening) and it had some numerical problems.  In the end I decided to 
leave our Ductus stuff in there until someone could come up with a more 
reliable Open Source replacement, but hopefully that code is close 
enough to be massaged into working order.


You can search the internet for parallel curves and find lots of 
literature on the subject.  I briefly looked through the web sites, but 
didn't have enough time to remember enough calculus and trigonometry to 
garner a solution out of it all with the time that I had.  Maybe you'll 
have better luck following the algorithms... ;-)


As far as flattening at the lowest level when doing scanline conversion, 
I like the idea of using forward differencing as it can create an 
algorithm that doesn't require all of the intermediate storage that a 
subdividing flattener requires.  One reference that describes the 
technique is on Dr. Dobbs site, though I imagine there are many others:


http://www.drdobbs.com/184403417;jsessionid=O5N5MDJRDMIKHQE1GHOSKH4ATMY32JVN

You can also look at the code in 
src/share/native/sun/java2d/pipe/ProcessPath.c for some examples of 
forward differencing in use (and that code also has similar techniques 
to what you did to first chop the path into vertical pieces).  BUT 
(*Nota Bene*), I must warn you that the geometry of the path is somewhat 
perturbed in that code since it tries to combine path normalization 
and rasterization into a single process.  It's not rendering pure 
geometry, it is rendering tweaked geometry which tries to make non-AA 
circles look round and other such aesthetics-targeted impurities.  While 
the code does have good examples of how to set up and evaluate forward 
differencing equations, don't copy too many of the details or you might 
end up copying some of the tweaks and the results will look strange 
under AA.  The Dr. Dobbs article should be your numerical reference and 
that reference code a practical, but incompatible, example...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-24 Thread Jim Graham

Hi Denis,

On 8/24/2010 3:35 PM, Jim Graham wrote:

As far as flattening at the lowest level when doing scanline conversion,
I like the idea of using forward differencing as it can create an
algorithm that doesn't require all of the intermediate storage that a
subdividing flattener requires. One reference that describes the
technique is on Dr. Dobbs site, though I imagine there are many others:

http://www.drdobbs.com/184403417;jsessionid=O5N5MDJRDMIKHQE1GHOSKH4ATMY32JVN


Just to provide a basic overview...

You can iterate a line with a constant delta-T using:

x += dx;
y += dy;

Similarly, you can iterate a quad curve with a constant delta-T using:

dx += ddx;
x  += dx;
dy += ddy;
y  += dy;

and a cubic using:

ddx += dddx;
dx  += ddx;
x   += dx;
ddy += dddy;
dy  += ddy;
y   += dy;

There are then techniques to apply to evaluate the dd[d]x and dd[d]y to 
see if the curve is flat enough for your particular needs.  If it 
isn't flat enough, then some simple math performed on the d* variables 
can double or halve the sampling rate for a localized portion of the 
curve.  Once you pass the curvy section, you can then reduce the 
sampling rate again by examining the d* variables.


Done right, this could probably be integrated at the innermost loop of 
the renderer to reduce its storage requirements for curves, but that 
would mean the inner loop would have to switch on the curve type to 
determine which sampling equations apply (though you could simply have 
quads have ddd[xy] = 0 and lines have dd[d][xy] = 0 and use a single set 
of code perhaps without too much performance impact).  Otherwise, this 
could simply be used to flatten and produce edges with less intermediate 
storage (and faster hopefully)...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-23 Thread Denis Lila
Hello.

So, I've been working on removing flattening from pisces (except for AA).
The following compiles and sort of works:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/

The changes to the AA renderer are small, and for dashing I used the 
algorithm
in your last e-mail. That has worked well so far. Actually, dashing is a bit 
more complicated, since that algorithm computes the arc length given the 
parameter
t. For dashing we needed the inverse of this function. Nevertheless, I found
something and dashing seems to work. For now I would like someone to take a look
at just Stroker.
To widen cubic curves, I use a cubic spline with a fixed number of curves 
for
each curve to be widened. This was meant to be temporary, until I could find a 
better algorithm for determining the number of curves in the spline, but I 
discovered today that that won't do it.
For example, the curve p.moveTo(0,0),p.curveTo(84.0, 62.0, 32.0, 34.0, 
28.0, 5.0)
looks bad all the way up to ~200 curves. Obviously, this is unacceptable.

It would be great if anyone has any better ideas for how to go about this.
To me it seems like the problem is that in the webrev I chop up the curve to be
interpolated at equal intervals of the parameter.

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Denis Lila wrote:
  Hi Jim.
  
  I think the first version is a better choice for now since you said
 that 
  the performance difference isn't noticeable.  I think the lower
 level 
  flattening might look a little different if we ever decide to
 upgrade 
  the pipeline to deal with curves.  In particular, you are still 
  flattening above the dashing/stroking code and I think the
 flattening 
  should be done below that code (i.e. in Renderer).
  
  Wouldn't we still need to flatten for dashing? Is there some way
 to
  quickly compute the arc length of a bezier curve from t=0 to
 t=some_number?
  As far as I can see the function for this computation is the
 integral of
  sqrt(polynomial_of_degree_4), and that would be pretty nasty.
  Or maybe we can get around this somehow?
 
 There should be.  Google turns up a few hits for compute arc length
 for 
 bezier curve that should be enlightening.
 
 BTW, have you looked at a widened dashed curved path with the closed 
 JDK?  I'm pretty sure it outputs dashed curves which proves the
 point.
 
 I have also computed these lengths for other projects (the shape 
 morphing used in some JavaOne demos and Java FX) using the following 
 process:
 
 - Compute the length of the control polynomial.
 - Compute the length of the line between the endpoints.
 - When they are within epsilon return the average as the arc
 length.
 - Otherwise subdivide and try again.
 
 I think you could also do something that looked at the relative angles
 
 of all of the control segments and if they are close enough to each 
 other then you can compute the arc length using a simplified equation
 or 
 simply empirically match this to the close enough rule as above.
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-11 Thread Denis Lila
Hi Jim.

 I think the first version is a better choice for now since you said that 
 the performance difference isn't noticeable.  I think the lower level 
 flattening might look a little different if we ever decide to upgrade 
 the pipeline to deal with curves.  In particular, you are still 
 flattening above the dashing/stroking code and I think the flattening 
 should be done below that code (i.e. in Renderer).

Wouldn't we still need to flatten for dashing? Is there some way to
quickly compute the arc length of a bezier curve from t=0 to t=some_number?
As far as I can see the function for this computation is the integral of
sqrt(polynomial_of_degree_4), and that would be pretty nasty.
Or maybe we can get around this somehow?

 - You indent by 8 spaces in a few places.  Is that a tabs vs. spaces 
 issue?  We try to stick to 4 space indentations with no tabs for 
 consistency.

Yes it is. Sorry about this. Eclipse is completely ignoring my replace
tabs with spaces option.

Thanks,
Denis.


- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 So, I'd go with the first one with the following comments:
 
 - I'd make the internal error message a little less personal. ;-) 
 normalization not needed in OFF mode or something.
 
 - lines 362,363 - you don't need to set cur_adjust variables here,
 they are already being set below.
 
 Other than that, it looks good to go...
 
   ...jim
 
 Denis Lila wrote:
  Hi Jim.
  
  So, I have the nicer webrevs.
  FlatteningIterator version:
 
 http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/
  
  Pisces flattening version:
 
 http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/
  
  I dealt with the issue of handling OFF by just not accepting it as
 an input.
  After all, a normalizing iterator only needs to be created, and is
 only created
  if the normalization mode is not OFF.
  
  Thanks,
  Denis.
  
  - Jim Graham james.gra...@oracle.com wrote:
  
  Hi Denis,
 
  I'll wait for some clean webrevs once you get the float stuff in
 for a
 
  final review.  I did take a really quick look and thought that a
  better 
  way to handle OFF would be to set rval to -1 and then check rval
 
  0 
  as the (quicker) test for OFF in the currentSegment() method. 
 Does
  that 
  make sense?
 
  In any case, let's wait for cleaner webrevs to go further on this 
  (hopefully in a day or so?)...
 
 ...jim
 
  On 8/5/2010 8:06 AM, Denis Lila wrote:
  Hi Jim.
 
  I made all the suggested changes.
  Links:
 
 
 http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/
 
 http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/
  Thanks,
  Denis.
 
  - Jim Graham james.gra...@oracle.com  wrote:
 
  Hi Denis,
 
  First, comments on the high level normalizer (Normalizing
  iterator):
  - If there is no normalization going on, I would use the Shape's
  own
  flattening (i.e. getPathIterator(at, flat)).  The reason being
  that
  some
  shapes may know how to flatten themselves better, or faster,
 than
  a
  Flattening Iterator.  In particular, rectangles and polygons
 would
  simply ignore the argument and save themselves the cost of
  wrapping
  with
  an extra iterator.  This would probably only be a big issue for
  very
  long Polygons.
 
  - Line 331 - the initializations to NaN aren't necessary as far
 as
  I
  can
  tell...?
 
  - Rather than saving mode in the normalizing iterator, how
 about
  saving 2 constants: (0.0, 0.5) for AA and (0.25, 0.25) for
 non-AA
  and
  then simply add those constants in rather than having to have
 the
  conditional with the 2 different equations?
 
   ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-11 Thread Jim Graham

Denis Lila wrote:

Hi Jim.

I think the first version is a better choice for now since you said that 
the performance difference isn't noticeable.  I think the lower level 
flattening might look a little different if we ever decide to upgrade 
the pipeline to deal with curves.  In particular, you are still 
flattening above the dashing/stroking code and I think the flattening 
should be done below that code (i.e. in Renderer).


Wouldn't we still need to flatten for dashing? Is there some way to
quickly compute the arc length of a bezier curve from t=0 to t=some_number?
As far as I can see the function for this computation is the integral of
sqrt(polynomial_of_degree_4), and that would be pretty nasty.
Or maybe we can get around this somehow?


There should be.  Google turns up a few hits for compute arc length for 
bezier curve that should be enlightening.


BTW, have you looked at a widened dashed curved path with the closed 
JDK?  I'm pretty sure it outputs dashed curves which proves the point.


I have also computed these lengths for other projects (the shape 
morphing used in some JavaOne demos and Java FX) using the following 
process:


- Compute the length of the control polynomial.
- Compute the length of the line between the endpoints.
- When they are within epsilon return the average as the arc length.
- Otherwise subdivide and try again.

I think you could also do something that looked at the relative angles 
of all of the control segments and if they are close enough to each 
other then you can compute the arc length using a simplified equation or 
simply empirically match this to the close enough rule as above.


...jim



Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-10 Thread Denis Lila
Hi Jim.

So, I have the nicer webrevs.
FlatteningIterator version:
http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/

Pisces flattening version:
http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/

I dealt with the issue of handling OFF by just not accepting it as an input.
After all, a normalizing iterator only needs to be created, and is only created
if the normalization mode is not OFF.

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 I'll wait for some clean webrevs once you get the float stuff in for a
 
 final review.  I did take a really quick look and thought that a
 better 
 way to handle OFF would be to set rval to -1 and then check rval < 0 
 as the (quicker) test for OFF in the currentSegment() method.  Does
 that 
 make sense?
 
 In any case, let's wait for cleaner webrevs to go further on this 
 (hopefully in a day or so?)...
 
   ...jim
 
 On 8/5/2010 8:06 AM, Denis Lila wrote:
  Hi Jim.
 
  I made all the suggested changes.
  Links:
 
 http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/
 
 http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/
 
  Thanks,
  Denis.
 
  - Jim Graham james.gra...@oracle.com  wrote:
 
  Hi Denis,
 
  First, comments on the high level normalizer (Normalizing
 iterator):
 
  - If there is no normalization going on, I would use the Shape's
 own
  flattening (i.e. getPathIterator(at, flat)).  The reason being
 that
  some
  shapes may know how to flatten themselves better, or faster, than
 a
  Flattening Iterator.  In particular, rectangles and polygons would
  simply ignore the argument and save themselves the cost of
 wrapping
  with
  an extra iterator.  This would probably only be a big issue for
 very
  long Polygons.
 
  - Line 331 - the initializations to NaN aren't necessary as far as
 I
  can
  tell...?
 
  - Rather than saving mode in the normalizing iterator, how about
  saving 2 constants: (0.0, 0.5) for AA and (0.25, 0.25) for non-AA
 and
 
  then simply add those constants in rather than having to have the
  conditional with the 2 different equations?
 
 ...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-10 Thread Jim Graham

Hi Denis,

I think the first version is a better choice for now since you said that 
the performance difference isn't noticeable.  I think the lower level 
flattening might look a little different if we ever decide to upgrade 
the pipeline to deal with curves.  In particular, you are still 
flattening above the dashing/stroking code and I think the flattening 
should be done below that code (i.e. in Renderer).


So, I'd go with the first one with the following comments:

- You indent by 8 spaces in a few places.  Is that a tabs vs. spaces 
issue?  We try to stick to 4 space indentations with no tabs for 
consistency.


- I'd make the internal error message a little less personal. ;-) 
"normalization not needed in OFF mode" or something.


- lines 362,363 - you don't need to set cur_adjust variables here, they 
are already being set below.


Other than that, it looks good to go...

...jim

Denis Lila wrote:

Hi Jim.

So, I have the nicer webrevs.
FlatteningIterator version:
http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/

Pisces flattening version:
http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/

I dealt with the issue of handling OFF by just not accepting it as an input.
After all, a normalizing iterator only needs to be created, and is only created
if the normalization mode is not OFF.

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

I'll wait for some clean webrevs once you get the float stuff in for a

final review.  I did take a really quick look and thought that a better
way to handle OFF would be to set rval to -1 and then check rval < 0
as the (quicker) test for OFF in the currentSegment() method.  Does that
make sense?


In any case, let's wait for cleaner webrevs to go further on this 
(hopefully in a day or so?)...


...jim

On 8/5/2010 8:06 AM, Denis Lila wrote:

Hi Jim.

I made all the suggested changes.
Links:


http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/
http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/

Thanks,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


Hi Denis,

First, comments on the high level normalizer (Normalizing

iterator):

- If there is no normalization going on, I would use the Shape's

own

flattening (i.e. getPathIterator(at, flat)).  The reason being

that

some
shapes may know how to flatten themselves better, or faster, than

a

Flattening Iterator.  In particular, rectangles and polygons would
simply ignore the argument and save themselves the cost of

wrapping

with
an extra iterator.  This would probably only be a big issue for

very

long Polygons.

- Line 331 - the initializations to NaN aren't necessary as far as

I

can
tell...?

- Rather than saving mode in the normalizing iterator, how about
saving 2 constants: (0.0, 0.5) for AA and (0.25, 0.25) for non-AA

and

then simply add those constants in rather than having to have the
conditional with the 2 different equations?

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-05 Thread Jim Graham

Hi Denis,

I'll wait for some clean webrevs once you get the float stuff in for a 
final review.  I did take a really quick look and thought that a better 
way to handle OFF would be to set rval to -1 and then check rval < 0 
as the (quicker) test for OFF in the currentSegment() method.  Does that 
make sense?
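A minimal sketch of that pattern (hypothetical helper, not the webrev code;
the rounding formula is the floor-based one discussed elsewhere in this
thread):

// rval < 0 doubles as the OFF flag, so the per-coordinate hot path is a
// single comparison instead of a separate mode check.
static void normalize(double[] coords, int count, double rval, double aval) {
    if (rval < 0) {
        return;  // STROKE_CONTROL off: leave coordinates untouched
    }
    for (int i = 0; i < count; i++) {
        coords[i] = Math.floor(coords[i] + rval) + aval;
    }
}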


In any case, let's wait for cleaner webrevs to go further on this 
(hopefully in a day or so?)...


...jim

On 8/5/2010 8:06 AM, Denis Lila wrote:

Hi Jim.

I made all the suggested changes.
Links:
http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/
http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/

Thanks,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


Hi Denis,

First, comments on the high level normalizer (Normalizing iterator):

- If there is no normalization going on, I would use the Shape's own
flattening (i.e. getPathIterator(at, flat)).  The reason being that
some
shapes may know how to flatten themselves better, or faster, than a
Flattening Iterator.  In particular, rectangles and polygons would
simply ignore the argument and save themselves the cost of wrapping
with
an extra iterator.  This would probably only be a big issue for very
long Polygons.

- Line 331 - the initializations to NaN aren't necessary as far as I
can
tell...?

- Rather than saving mode in the normalizing iterator, how about
saving 2 constants: (0.0, 0.5) for AA and (0.25, 0.25) for non-AA and

then simply add those constants in rather than having to have the
conditional with the 2 different equations?

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-04 Thread Denis Lila
Hello Jim.

So, I've now implemented both suggestions you had for implementing
stroke control: an intermediate normalizing path iterator, and 
doing flattening in pisces at the lowest level. Respectively,
the webrevs are:
http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/
http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/

Again, these include the floating point conversion changes, but all of the
changes relevant to this e-mail are in PiscesRenderingEngine.java.

As for performance, the version with low level flattening is faster, but this
is only noticeable when pisces is run on its own, i.e. not actually drawing
anything (by the way, I've included a commented-out main method in the second
webrev. This allowed me to run pisces on its own, which was useful for
debugging and performance testing).
However, when drawing stuff, whatever happens after pisces takes up so much
time that it's hard to tell the difference.

Nevertheless, it might be worth keeping the somewhat ugly, low level version.
It might be useful for antialiasing if Stroker moves to outputting curves
instead of just lines (because AA still needs to flatten, unless someone comes
up with an algorithm to do it without flattening, but I can't think of
anything).

I'm sorry to ask you specifically for all these reviews - you've already spent
a lot of time looking at my work, but no one else has replied to any of my
pisces inquiries (except Clemens Eisserer, on this issue). Anyway, I would
appreciate it if you, or anyone, could take a look at one or both of the above
webrevs.

Or maybe we could leave it for when the floating point conversion has been
pushed. Then the webrevs would have a lot less clutter.

Thanks,
Denis.

- Denis Lila dl...@redhat.com wrote:

 Hi Jim.
 
 I implemented STROKE_CONTROL today. I used the intermediate
 NormalizingPathIterator, instead of implementing flattening in
 pisces, because I wanted to get something working asap, and this
 would be the easiest way.
 
 The webrev is
 http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/
 
 I guess I'm not asking that you take a look at it, because it's
 probably
 not the way we're going to end up doing things, but I wrote it, so
 I'm sending the link just in case anyone wants to see it.
 
 The webrev is big because it includes the floating point conversion,
 but all the
 STROKE_CONTROL changes are in PiscesRenderingEngine.java.
 
 Thanks,
 Denis.
 
 - Jim Graham james.gra...@oracle.com wrote:
 
  Hi Denis,
 
  It would be ill-advised to normalize the coordinates after
 flattening.
 
  The quality would be really bad.
 
  Perhaps this is a good reason to start looking at updating Pisces to
  take curves and flatten at the lowest level?
 
  Or, I suppose we could get a non-flattened iterator from the source
  path, send it through a normalizing filter, and then through a
  flattening filter (the way many of the existing objects do
 flattening
  is
  to just get their regular iterator and run it through an instance of
  FlatteningPathIterator so we could do this manually with an
  intervening
  NormalizingPathIterator if normalization is needed)...
 
  ...jim
 
  Denis Lila wrote:
   Hello Jim.
  
   Thanks for that. I'll get to work on implementing it.
  
   One thing though, about normalizing the control points of bezier
   curves: pisces never gets any bezier curves as input. It only gets
   lines that are the product of flattening bezier curves.
  
   Pisces receives its input from flattening path iterators which get
  it
   from other path iterators. Of course we can't require these to
 send
  out
   normalized points. In order to normalize the control points we
 need
  to
   be able to look at the bezier curves in Pisces, so we can't just
  take
   all the input from the flattener. However, pisces can't handle
  curves
   (yet, hopefully), so after the normalization, they must be
  flattened, and
   this is the problem. I think it's a pretty good idea to do this by
   storing the input from the iterator into pisces (after
  normalization),
   creating a nameless path iterator that just iterates through all
  that,
   and using this iterator to create a flattening iterator, which
 then
   is used as before.
  
   Does anyone have any other ideas?
  
   Thank you,
   Denis.
  
  
   - Jim Graham james.gra...@oracle.com wrote:
  
   For AA this is exactly what we do (round to nearest pixel centers
  for
  
   strokes).  Note that this is done prior to any line widening code
  is
   executed.
  
   For non-AA we normalize coordinates to, I believe the (0.25,
 0.25)
 
   sub-pixel location.  This is so that the transitions between
  widening
   of
   lines occurs evenly (particularly for horizontal and vertical
 wide
 
   lines).  If you round to pixel edges then you have the following
   progression (note that the line width grows by half on either
 side
  of
  
   the original 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-04 Thread Jim Graham

Hi Denis,

First, comments on the high level normalizer (Normalizing iterator):

- If there is no normalization going on, I would use the Shape's own 
flattening (i.e. getPathIterator(at, flat)).  The reason being that some 
shapes may know how to flatten themselves better, or faster, than a 
Flattening Iterator.  In particular, rectangles and polygons would 
simply ignore the argument and save themselves the cost of wrapping with 
an extra iterator.  This would probably only be a big issue for very 
long Polygons.  (A sketch of this dispatch follows the list.)


- Line 331 - the initializations to NaN aren't necessary as far as I can 
tell...?


- Rather than saving mode in the normalizing iterator, how about 
saving 2 constants: (0.0, 0.5) for AA and (0.25, 0.25) for non-AA and 
then simply add those constants in rather than having to have the 
conditional with the 2 different equations?
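A sketch of the dispatch from the first point above (NormalizingPathIterator
is the wrapper class discussed in this thread; its constructor and the method
shape here are otherwise hypothetical):

import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.FlatteningPathIterator;
import java.awt.geom.PathIterator;

static PathIterator iteratorFor(Shape s, AffineTransform at,
                                double flatness, boolean normalizing) {
    if (!normalizing) {
        // Let the shape flatten itself; rectangles and polygons ignore
        // the flatness argument and skip the extra wrapper entirely.
        return s.getPathIterator(at, flatness);
    }
    // Normalize the unflattened segments first, then flatten them.
    PathIterator normalized = new NormalizingPathIterator(s.getPathIterator(at));
    return new FlatteningPathIterator(normalized, flatness);
}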


...jim



Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-02 Thread Denis Lila
Hi Jim.

I implemented STROKE_CONTROL today. I used the intermediate 
NormalizingPathIterator, instead of implementing flattening in
pisces, because I wanted to get something working asap, and this
would be the easiest way.

The webrev is 
http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/

I guess I'm not asking that you take a look at it, because it's probably
not the way we're going to end up doing things, but I wrote it, so
I'm sending the link just in case anyone wants to see it.

The webrev is big because it includes the floating point conversion, but all the
STROKE_CONTROL changes are in PiscesRenderingEngine.java.

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 It would be ill-advised to normalize the coordinates after flattening.
 
 The quality would be really bad.
 
 Perhaps this is a good reason to start looking at updating Pisces to 
 take curves and flatten at the lowest level?
 
 Or, I suppose we could get a non-flattened iterator from the source 
 path, send it through a normalizing filter, and then through a 
 flattening filter (the way many of the existing objects do flattening
 is 
 to just get their regular iterator and run it through an instance of 
 FlatteningPathIterator so we could do this manually with an
 intervening 
 NormalizingPathIterator if normalization is needed)...
 
   ...jim
 
 Denis Lila wrote:
  Hello Jim.
  
  Thanks for that. I'll get to work on implementing it.
  
  One thing though, about normalizing the control points of bezier
  curves: pisces never gets any bezier curves as input. It only gets
  lines that are the product of flattening bezier curves.
  
  Pisces receives its input from flattening path iterators which get
 it
  from other path iterators. Of course we can't require these to send
 out
  normalized points. In order to normalize the control points we need
 to
  be able to look at the bezier curves in Pisces, so we can't just
 take
  all the input from the flattener. However, pisces can't handle
 curves
  (yet, hopefully), so after the normalization, they must be
 flattened, and
  this is the problem. I think it's a pretty good idea to do this by 
  storing the input from the iterator into pisces (after
 normalization), 
  creating a nameless path iterator that just iterates through all
 that,
  and using this iterator to create a flattening iterator, which then
  is used as before.
  
  Does anyone have any other ideas?
  
  Thank you,
  Denis.
  
  
  - Jim Graham james.gra...@oracle.com wrote:
  
  For AA this is exactly what we do (round to nearest pixel centers
 for
 
  strokes).  Note that this is done prior to any line widening code
 is 
  executed.
 
  For non-AA we normalize coordinates to, I believe the (0.25, 0.25)
 
  sub-pixel location.  This is so that the transitions between
 widening
  of 
  lines occurs evenly (particularly for horizontal and vertical wide
 
  lines).  If you round to pixel edges then you have the following 
  progression (note that the line width grows by half on either side
 of
 
  the original geometry so you have to consider the line widths
 where
 
  you encounter the pixel centers to your left and right (or above
 and 
  below) which govern when that column (or row) of pixels first
 turns
  on):
 
  width 0.00 => 0.99  nothing drawn (except we kludge this)
  width 1.00 => 1.00  1 pixel wide (col to left turns on)
  width 1.01 => 2.99  2 pixels wide (col to right turns on)
  width 3.00 => 3.00  3 pixels wide (etc.)
  width 3.01 => 4.99  4 pixels wide
 
  Note that it is nearly impossible to get an odd-width line.  You 
  basically have to have exactly an integer width to get an odd-width
 
  line.  This is because at the odd widths you reach the half pixel
 
  locations on both sides of the line at the same time.  Due to the 
  half-open insideness rules only one of the pixels will be chosen
 to
  be 
  inside this path.  Just below these sizes and you fail to hit
 either 
  pixel center.  Just at the integer size you reach both pixel
 centers
  at 
  the same time.  Just slightly larger than that width and now you've
 
  fully enclosed both pixel centers and the line width has to
 increase
  by 
  nearly 2.0 until you reach the next pixel centers.
 
  (The kludge I talk about above is that we set a minimum pen width
 so 
  that we never fail to draw a line even if the line width is set to
  0.0, 
  but the above table was a theoretical description of the absolute
  rules.)
 
  If we rounded them to pixel centers, then the transitions look
 like
  this:
 
  width 0.00 => 0.00  nothing drawn (modulo kludge)
  width 0.01 => 1.99  1 pixel wide (column you are in turns on)
  width 2.00 => 2.00  2 pixels wide (column to left turns on)
  width 2.01 => 3.99  3 pixels wide (column to right turns on)
  width 4.00 => 4.00  4 pixels wide (etc.)
  width 4.01 => 5.99  5 pixels wide
 
  We have a similar effect as 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-07-12 Thread Denis Lila
Hello Jim.

I think the second way would be better because there would be no
repeated code, and it would be easier to implement.
Do you think there will be any performance benefit from the first way?

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:

 Hi Denis,
 
 It would be ill-advised to normalize the coordinates after flattening.
 
 The quality would be really bad.
 
 Perhaps this is a good reason to start looking at updating Pisces to 
 take curves and flatten at the lowest level?
 
 Or, I suppose we could get a non-flattened iterator from the source 
 path, send it through a normalizing filter, and then through a 
 flattening filter (the way many of the existing objects do flattening
 is 
 to just get their regular iterator and run it through an instance of 
 FlatteningPathIterator so we could do this manually with an
 intervening 
 NormalizingPathIterator if normalization is needed)...
 
   ...jim
 
 Denis Lila wrote:
  Hello Jim.
  
  Thanks for that. I'll get to work on implementing it.
  
  One thing though, about normalizing the control points of bezier
  curves: pisces never gets any bezier curves as input. It only gets
  lines that are the product of flattening bezier curves.
  
  Pisces receives its input from flattening path iterators which get
 it
  from other path iterators. Of course we can't require these to send
 out
  normalized points. In order to normalize the control points we need
 to
  be able to look at the bezier curves in Pisces, so we can't just
 take
  all the input from the flattener. However, pisces can't handle
 curves
  (yet, hopefully), so after the normalization, they must be
 flattened, and
  this is the problem. I think it's a pretty good idea to do this by 
  storing the input from the iterator into pisces (after
 normalization), 
  creating a nameless path iterator that just iterates through all
 that,
  and using this iterator to create a flattening iterator, which then
  is used as before.
  
  Does anyone have any other ideas?
  
  Thank you,
  Denis.
  
  
  - Jim Graham james.gra...@oracle.com wrote:
  
  For AA this is exactly what we do (round to nearest pixel centers
 for
 
  strokes).  Note that this is done prior to any line widening code
 is 
  executed.
 
  For non-AA we normalize coordinates to, I believe the (0.25, 0.25)
 
  sub-pixel location.  This is so that the transitions between
 widening
  of 
  lines occurs evenly (particularly for horizontal and vertical wide
 
  lines).  If you round to pixel edges then you have the following 
  progression (note that the line width grows by half on either side
 of
 
  the original geometry so you have to consider the line widths
 where
 
  you encounter the pixel centers to your left and right (or above
 and 
  below) which govern when that column (or row) of pixels first
 turns
  on):
 
  width 0.00 => 0.99  nothing drawn (except we kludge this)
  width 1.00 => 1.00  1 pixel wide (col to left turns on)
  width 1.01 => 2.99  2 pixels wide (col to right turns on)
  width 3.00 => 3.00  3 pixels wide (etc.)
  width 3.01 => 4.99  4 pixels wide
 
  Note that it is nearly impossible to get an odd-width line.  You 
  basically have to have exactly an integer width to get an odd-width
 
  line.  This is because at the odd widths you reach the half pixel
 
  locations on both sides of the line at the same time.  Due to the 
  half-open insideness rules only one of the pixels will be chosen
 to
  be 
  inside this path.  Just below these sizes and you fail to hit
 either 
  pixel center.  Just at the integer size you reach both pixel
 centers
  at 
  the same time.  Just slightly larger than that width and now you've
 
  fully enclosed both pixel centers and the line width has to
 increase
  by 
  nearly 2.0 until you reach the next pixel centers.
 
  (The kludge I talk about above is that we set a minimum pen width
 so 
  that we never fail to draw a line even if the line width is set to
  0.0, 
  but the above table was a theoretical description of the absolute
  rules.)
 
  If we rounded them to pixel centers, then the transitions look
 like
  this:
 
  width 0.00 => 0.00  nothing drawn (modulo kludge)
  width 0.01 => 1.99  1 pixel wide (column you are in turns on)
  width 2.00 => 2.00  2 pixels wide (column to left turns on)
  width 2.01 => 3.99  3 pixels wide (column to right turns on)
  width 4.00 => 4.00  4 pixels wide (etc.)
  width 4.01 => 5.99  5 pixels wide
 
  We have a similar effect as above, but biased towards making even
 line
 
  widths harder.
 
  So, by locating lines at (0.25, 0.25) subpixel location we end up with
  a very even progression:
 
  width 0.00 => 0.50  nothing drawn (modulo kludge)
  width 0.51 => 1.50  1 pixel wide (column you are in turns on)
  width 1.51 => 2.50  2 pixel wide (column to left gets added)
  width 2.51 => 3.50  3 pixel wide (column to right gets added)
  width 3.51 => 4.50  4 pixel wide (etc.)

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-07-07 Thread Jim Graham
For AA this is exactly what we do (round to nearest pixel centers for 
strokes).  Note that this is done prior to any line widening code is 
executed.


For non-AA we normalize coordinates to, I believe the (0.25, 0.25) 
sub-pixel location.  This is so that the transitions between widening of 
lines occurs evenly (particularly for horizontal and vertical wide 
lines).  If you round to pixel edges then you have the following 
progression (note that the line width grows by half on either side of 
the original geometry so you have to consider the line widths where 
you encounter the pixel centers to your left and right (or above and 
below) which govern when that column (or row) of pixels first turns on):


width 0.00 => 0.99  nothing drawn (except we kludge this)
width 1.00 => 1.00  1 pixel wide (col to left turns on)
width 1.01 => 2.99  2 pixels wide (col to right turns on)
width 3.00 => 3.00  3 pixels wide (etc.)
width 3.01 => 4.99  4 pixels wide

Note that it is nearly impossible to get an odd-width line.  You 
basically have to have exactly an integer width to get an odd-width 
line.  This is because at the odd widths you reach the half pixel 
locations on both sides of the line at the same time.  Due to the 
half-open insideness rules only one of the pixels will be chosen to be 
inside this path.  Just below these sizes and you fail to hit either 
pixel center.  Just at the integer size you reach both pixel centers at 
the same time.  Just slightly larger than that width and now you've 
fully enclosed both pixel centers and the line width has to increase by 
nearly 2.0 until you reach the next pixel centers.


(The kludge I talk about above is that we set a minimum pen width so 
that we never fail to draw a line even if the line width is set to 0.0, 
but the above table was a theoretical description of the absolute rules.)


If we rounded them to pixel centers, then the transitions look like this:

width 0.00 => 0.00  nothing drawn (modulo kludge)
width 0.01 => 1.99  1 pixel wide (column you are in turns on)
width 2.00 => 2.00  2 pixels wide (column to left turns on)
width 2.01 => 3.99  3 pixels wide (column to right turns on)
width 4.00 => 4.00  4 pixels wide (etc.)
width 4.01 => 5.99  5 pixels wide

We have a similar effect as above, but biased towards making even line 
widths harder.


So, by locating lines at (0.25, 0.25) subpixel location we end up with a 
very even progression:


width 0.00 => 0.50  nothing drawn (modulo kludge)
width 0.51 => 1.50  1 pixel wide (column you are in turns on)
width 1.51 => 2.50  2 pixel wide (column to left gets added)
width 2.51 => 3.50  3 pixel wide (column to right gets added)
width 3.51 => 4.50  4 pixel wide (etc.)

This gives us nice even and gradual widening of the lines as we increase 
the line width by sub-pixel amounts and the line widths are fairly 
stable around integer widths.


Also, note that we don't say "when stroking" as you might want to 
normalize both strokes and fills so that they continue to match.  I 
believe that we normalize both strokes and fills for non-AA and we only 
normalize strokes for AA (and leave AA fills as pure).  AA is less 
problematic with respect to creating gaps if your stroke and fill 
normalization are not consistent.


The rounding equations are along the lines of:

v = Math.floor(v + rval) + aval;

For center of pixel you use (rval=0.0, aval=0.5)
For 0.25,0.25 rounding use  (rval=0.25, aval=0.25)
For edge of pixel you use   (rval=0.5, aval=0.0)
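For example, pushing a sample coordinate through the three variants:

// v = Math.floor(v + rval) + aval with v = 3.7:
double v = 3.7;
double center  = Math.floor(v + 0.0)  + 0.5;   // 3.5  (pixel center)
double quarter = Math.floor(v + 0.25) + 0.25;  // 3.25 (0.25,0.25 grid)
double edge    = Math.floor(v + 0.5)  + 0.0;   // 4.0  (pixel edge)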

Also, we came up with an interesting way of adjusting the control points 
of quads and cubics if we adjusted their end points, but I don't know if 
what we did was really the best idea.  For quads we adjust the control 
point by the average of the adjustments that we applied to its 2 end 
points.  For cubics, we move the first control point by the same amount 
as we moved the starting endpoint and the second control point by the 
amount we moved the final endpoint.  The jury is out on whether that is 
the most aesthetic technique...
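A sketch of that adjustment (hypothetical helpers, not the actual code;
(dx0, dy0) and (dx1, dy1) stand for the deltas that normalization applied
to the curve's start and end points):

// Quad: the single control point moves by the average of the endpoint moves.
static void adjustQuadCtrl(double[] c, double dx0, double dy0,
                           double dx1, double dy1) {
    c[0] += (dx0 + dx1) / 2;
    c[1] += (dy0 + dy1) / 2;
}

// Cubic: each control point follows its nearer endpoint.
static void adjustCubicCtrls(double[] c, double dx0, double dy0,
                             double dx1, double dy1) {
    c[0] += dx0;  c[1] += dy0;  // first control point: same move as the start
    c[2] += dx1;  c[3] += dy1;  // second control point: same move as the end
}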


...jim

Denis Lila wrote:

Regarding VALUE_STROKE_NORMALIZE the API says:
Stroke normalization control hint value -- geometry should
be normalized to improve uniformity or spacing of lines and
overall aesthetics. Note that different normalization 
algorithms may be more successful than others for given 
input paths. 


I can only think of one example where VALUE_STROKE_NORMALIZE makes a visible
difference between the closed source implementation and OpenJDK:
when drawing anti-aliased horizontal or vertical lines of width 1, Pisces 
draws a 2 pixel wide line with half intensity (because integer coordinates 
are between pixels). Sun's jdk with VALUE_STROKE_NORMALIZE turned on draws 
a 1 pixel line with full intensity. This could be achieved by just 
checking for normalization and rounding 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-07-07 Thread Jim Graham
The first part means that if the scale is uniform in X and Y 
(AffineTransform has some logic to determine this property in its 
getType() method) then we can use X11 to do line widening by just giving 
it a scaled line width.  Also, X11 is limited to integer line widths so 
we would only want to do this if the SPEED hint is specified (not 
QUALITY) or if the scaled line width was close to an integer.
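A sketch of those two checks (my reading of the getType() flags; worth
verifying against the AffineTransform javadoc before relying on it):

import java.awt.geom.AffineTransform;

// Uniform scale: no general (non-uniform) scale and no shear component.
static boolean isUniformScale(AffineTransform at) {
    int type = at.getType();
    return (type & (AffineTransform.TYPE_GENERAL_SCALE |
                    AffineTransform.TYPE_GENERAL_TRANSFORM)) == 0;
}

// For a uniform scale the determinant is (scale factor)^2, so this
// recovers the scaled line width to hand to X11.
static double scaledLineWidth(AffineTransform at, double lineWidth) {
    return lineWidth * Math.sqrt(Math.abs(at.getDeterminant()));
}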


If we are going to use a software rasterizer to widen the line and then 
send over spans to render, it may be faster to just give X11 the 
original path and a scaled line width and ask it to widen the line. 
Even if it uses a software renderer the reduction in protocol traffic is 
a win, and their rasterizer is probably optimized for integer polygons 
and may likely be faster than our more general curve-handling code.


But, I would rank this low on optimizations at this point...

...jim

Clemens Eisserer wrote:

Hi Denis,


In sun.java2d.x11.X11Renderer, line 340, it says:
   // REMIND: X11 can handle uniform scaled wide lines
   // and dashed lines itself if we set the appropriate
   // XGC attributes (TBD).


Also, it is a known issue that Pisces does not support the STROKE_CONTROL
hint.

I have been wanting to implement these two features, and I have a few questions:
Has anything been decided on the first issue? Do we still want to implement it?
If yes, can anyone give me some rough suggestions as to how I can get started?


It's just my personal opinion, but I would recommend not implementing it.
Xorg falls back to software anyway for anything more complex than
solid rectangles and blits,
and those code-paths will only be triggered for non-antialiased
rendering with solid colors.

Implementing it in Pisces would help every backend OpenJDK supports :)

Just checked, and I also ignore the STROKE_CONTROL stuff completely in
the cairo based Jules rasterizer.
Curious how that could be mapped to Cairo; do you know of any more
in-depth explanation of how it works, or examples of how it should
look?

Thanks, Clemens


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-07-06 Thread Denis Lila
Clemens, thanks for your reply.

 It's just my personal opinion, but I would recommend not implementing
 it.
 Xorg falls back to software anyway for anything more complex than
 solid rectangles and blits, and those code-paths will only be triggered
 for non-antialiased rendering with solid colors.
 
 Implementing it in Pisces would help every backend OpenJDK supports
 :)

That makes sense. I think you're right.

 
 Just checked, and I also ignore the STROKE_CONTROL stuff completely in
 the cairo based Jules rasterizer.
 Curious how that could be mapped to Cairo; do you know of any more
 in-depth explanation of how it works, or examples of how it should look?
 
 Thanks, Clemens

Regarding VALUE_STROKE_NORMALIZE the API says:
Stroke normalization control hint value -- geometry should
be normalized to improve uniformity or spacing of lines and
overall aesthetics. Note that different normalization 
algorithms may be more successful than others for given 
input paths. 

I can only think of one example where VALUE_STROKE_NORMALIZE makes a visible
difference between the closed source implementation and OpenJDK:
when drawing anti-aliased horizontal or vertical lines of width 1, Pisces 
draws a 2 pixel wide line with half intensity (because integer coordinates
are between pixels). Sun's jdk with VALUE_STROKE_NORMALIZE turned on draws
a 1 pixel line with full intensity. This could be achieved by just
checking for normalization and rounding coordinates to the nearest half 
pixel, but this solution seems too simple, and I'm not sure whether I'm missing
anything. It would also probably cause problems when drawing anti-aliased 
short lines (which is done when drawing any sort of curve).
Unless, of course, this rounding was restricted to just horizontal and 
vertical lines.

Regards,
Denis.