Hi Prahalad,

This latest version looks like it should produce correct answers.

I'd like to see benchmark results on more platforms, particularly Windows since the different compilers may optimize these things differently.

Also, you didn't mention what data set you used for benchmarking. I'd like to see benchmark results for small, medium and large font sizes, and possibly bold and italic fonts as well. The reason is that the relative ratios of "empty glyph pixels" to "partial glyph pixels" to "fully covered glyph pixels" changes depending on the font type and size so if you made one of those faster and another slower then the results may be seen as a gain in one type of test if you only test one font type and size and it happens to match the part of the code that is more streamlined. Also, for small font sizes the per-glyph overhead might hide per-pixel issues. Please share which fonts and sizes you used for testing and consider using some different sizes, including something very large like 36 or 48 points (something with large areas of opaque pixels) as well as more normal sizes that appear in GUIs. Also, bold fonts can have a higher percentage of opaque pixels.

In particular...

This latest version is missing the "fast path" from the existing code for the case of srcA == 255 and glyphA == 255 where it just stores the FG_PIXEL values directly. For large fonts/glyphs that case may be as important as detecting empty glyph pixels.

On the other hand, an additional "if" may cause the compiler to generate less efficient code as per Sergey's concern. Since this "if" eliminates some multiplies and possibly some divides, I'd hope it would be a win-win.

You could add the fast path back inside the case where mixValSrc is 255 and just test srcA for 255 and do the Store ## DST ## PixelData() macro that used to be at the end of the block in the existing code, and then use "break;" to escape out of the do/while surrounding the whole macro so it moves on to the next pixel.

(Arguably, we might want to consider teaching our SRCOVER_MASKFILL to do the same thing. I think that was one of the added values of having a separate GLYPH loop, but really both should be optimizing that case...)

I can see now that the macro switch to use the same macro set as SRCOVER_MASKFILL required you to compute the pixel address, as you noted in your summary. It makes the new macro more cumbersome and makes it look like it's doing a bit more work per-pixel, but in reality I think the overall operations end up being the same as long as the compiler optimizes the deliberate multiplications the same way it did for implicit multiplications in the "pRas[foo]" and "pRas[foo*4]" code that was being generated previously. Benchmarks will tell us if we're good there...

                        ...jim


On 3/28/16 5:33 AM, Prahalad Kumar Narayanan wrote:
Hello Everyone on Java2D Group

Good day to you.

This is a follow-up to Review Request on bug:
     Bug          : JDK-8015070
     Bug Link : https://bugs.openjdk.java.net/browse/JDK-8015070

First, Thanks to Jim and Sergey for their feedback on the changes so far.

Inferences from Jim 's Feedback on Loading destination colors:
     1.  The API or Macro that I had earlier used to Load DST colors, indeed, 
adjusted destination color components with divide-by-alpha if destination was 
already pre-multiplied.
                  My apologies.. I should have spotted this error at the first 
iteration itself.
     2.  Due to the divide-by-alpha adjustment, the remaining logic would 
become incorrect. ( Especially, the multiplication with dstF based on 
pre-mulitplication status )
     3.  Correct API is being used now, and the dstColor components are loaded 
directly without any divide-by-alpha adjustment.

Inferences from Sergey's Feedback on Performance.
     1.  To set the context for everyone, the logic present in the current 
webrev.02 is modelled as per SRCOVER_MASKFILL.
                  There are multiple if (...) conditions that remove 
un-necessary blending calculations. Ideally this should improve performance.
                  However, since some data are not readily available (as 
present in SRCOVER_MASKFILL), few additional calculations have been added.
                  They are: pre-multiplying srcColor with alpha and assigning 
to res.
                                      Finding the correct address of Pixel 
using DST_PTR and PixelStride.
                  Henceforth, as Sergey suggests, Observation on performance 
will be beneficial.
     2.  The performance of the new logic was measured with 
linux-x86_64-normal-server-release config and compared with the logic used in 
un-optimized code in webrev.00
     3.  Result: The newer logic provides a fractional gain (about 15 - 20 ms) 
over the older logic.

Other Subtle Changes:
     1.  The test file has been renamed from AADrawStringArtifact.java to more 
meaningful - AntialiasedTextArtifact.java
     2.  The test file tests for both TYPE_INT_ARGB and TYPE_INT_ARGB_PRE 
BufferedImage formats.
                 The code has been well commented to explain the logic used in 
every function.

Kindly take your time to review the changes in the webrev link mentioned below 
and provide your suggestions.
          Webrev Link: 
http://cr.openjdk.java.net/~psadhukhan/prahlad/8015070/webrev.02/


Thank you for your time in review
Have a good day

Prahalad N.


-----Original Message-----
From: Jim Graham
Sent: Thursday, March 24, 2016 7:57 AM
To: Prahalad Kumar Narayanan; 2d-dev@openjdk.java.net
Subject: Re: [OpenJDK 2D-Dev] [2D-Dev] Review Request: JDK-8015070: Antialiased 
text on translucent backgrounds gets bright artifacts

Hi Prahalad,

(On a side note - ouch!  I came up with these macros in the first place, but 20 
years later I'm now realizing just how hard they are to navigate and review.  
My apologies...)

The macro you are still using to load the destination data is one that loads it 
to non-premultiplied components, which means it will divide out the alpha if 
the destination is PRE.  The rest of the logic assumes that the components were 
loaded without any adjustment of their premultiplication so not only is that 
division an unnecessary operation, it makes the following math wrong.

The SRCOVER_MASKFILL macro seems to use "Postload ## STRATEGY ## From ## TYPE" which 
seems to load them into separate components without any adjustment of their pre-multiplication 
status.  This is paired with "LoadAlphaFrom ## TYPE ## For ## STRATEGY" to load just the 
destination alpha for computing dstF...

                        ...jim

On 3/22/16 4:35 AM, Prahalad Kumar Narayanan wrote:
Hello Everyone on Java2D Group

Good day to you.

This is a Follow-up to Review Request on the bug:
      Bug          : JDK-8015070   Anti-aliased Text on Translucent background 
gets bright artifacts
      Bug Link : https://bugs.openjdk.java.net/browse/JDK-8015070

First, Sincere thanks to Jim for his valuable feedback.
      1.  As Jim correctly mentioned, SRCOVER_MASKFILL has a similar logic to 
blend two Translucent colors based on an Alpha mask.
      2.  The calculations are exactly the same as the changes in previous 
webrev.
                  However the logic of SRCOVER_MASKFILL is 'Optimized' to 
reduce the number of computations.
      3.  This optimization is definitely required because, the logic is 
executed for every single pixel in a glyph.
                  Example: If a string is made up of 5 English characters with 
each character having 32 x 32 pixels,
                                     The anti-aliasing logic will be executed 
for 5 x 32 x 32 iterations.
                  Reducing computation per pixel will imply a huge benefit for 
complete drawString operation.

Observation from SRCOVER_MASKFILL
      1.  The mask fill reduces computations by through multiple if(...) 
conditions.
                    Each if condition affirms whether the next set of 
computations are required.
      2.  Optimization 1: If mask value is 0- skip entire logic.
      3.  Optimization 2: If mask value is maximum, say 255, take srcA directly 
without multiplying with maskAlpha ( Reason: 1 * srcAlpha = srcAlpha )
      4.  Optimization 3: Compute pre-multiplied resColor in two steps. First 
with srcColor and then with dstColor.
                    If the resAlpha from 1st step (i.e., srcColor) is fully 
opaque, avoid blending dstColor altogether.

Changes in Current Webrev.01
      1.  The fix for the current bug is modelled based on SRCOVER_MASKFILL.
      2.  The changes have been verified to work on windows, linux and mac 
operating systems.
      3.  The automated Test file- AADrawStringArtifact.java runs as expected
                 Identifies artifact & throws exception when run on JDK 7 and 8.
                 With JDK9, the test file returns without error.
      3.  JPRT build has been run to ensure, changes build on all supported 
platforms.
                  JPRT job link :
http://scaaa637.us.oracle.com//archive/2016/03/2016-03-22-070604.prahn
ara-linux.client

Kindly review the changes in the below mentioned link and provide your views
      Webrev Link :
http://cr.openjdk.java.net/~psadhukhan/prahlad/8015070/webrev.01/


Thank you for your time in review
Have a good day

Prahalad N.


-----Original Message-----
From: Jim Graham
Sent: Friday, March 18, 2016 6:07 AM
To: Prahalad Kumar Narayanan; 2d-dev@openjdk.java.net
Subject: Re: [OpenJDK 2D-Dev] [2D-Dev] Review Request: JDK-8015070:
Antialiased text on translucent backgrounds gets bright artifacts

Hi Prahalad,

This basically boils down to "alpha blending math needs to be performed in 
premultiplied form" and the existing code was not doing that.

Your changes do add a pre-multiplication step to the math in two
places
- when you mix the src alpha and the glyph alpha at the top of the macro, and 
again when you do the Multiply(dst, dstA, dst) step in the middle.  In that 
sense, the new math is correct.

However, it is not the optimal way to implement this.  In particular,
the macro used here to load the data from the destination is the one
that loads it into 4 ARGB non-premultiplied values.  If the
destination is non-PRE, then your added multiply step is exactly what
is needed, but it could be combined with the multiply that will happen
later in the blending equation, so it is an added step rather than a
fixed fraction in the existing MultMultAdd parameters.  Additionally,
if the destination is already PRE, then the macro being used to load
the dst pixel data there will have done a divide step to divide out
the alpha from the destination, only to have you reverse that math
with your new
Multiply() step.  That's a lot of math to end up with a NOP.

The MUL8 you added for the srcA and glyph value is needed, that change is good. 
 Since it is common for glyph pixels to have zero alpha, you might want to test 
the glyph alpha for 0 and skip the pixel before you do the multiply, though.  
This would add one more if, but it would be a common case.

The trickier part is to load the destination components without un-premultiplying them.  
Unfortunately there is no "Load...ToArgbPre"
macro to parallel the Load macro used in the function.  Perhaps there should 
be, but you'd still end up with an extra multiply step as I mentioned above 
because you can fold the premultiplication of the dst data into the MultMultAdd 
by carefully choosing the parameters you use in the math there.

The good news is that the way that the SRCOVER_MASKFILL uses the various 
type-specific macros works around this a bit and minimizes the number of 
multiplies that happen.  You could check out DEFINE_SRCOVER_MASKFILL and see 
how it works in the case where pMask is not null (pMask is an alpha mask with 
values very similar to the glyphAA data).  Modeling this code on that code 
would correct the math and minimize it as well...

                        ...jim

On 3/17/16 3:00 AM, Prahalad Kumar Narayanan wrote:
Hello Everyone on Java2D Group

Good day to you.

Herewith, I 'm sharing the webrev for two identical Java2D Bugs.

Bug ID : JDK-8015070

          Title     : Antialiased text on translucent backgrounds gets
bright artifacts

          Link      : https://bugs.openjdk.java.net/browse/JDK-8015070

Bug ID : JDK-8013564 ( currently closed as duplicate )

          Title     : Font rendering anti-aliasing in translucent
BufferedImages broken

          Link      : https://bugs.openjdk.java.net/browse/JDK-8013564

Webrev Link :
http://cr.openjdk.java.net/~psadhukhan/prahlad/8015070/webrev.00/

Quick Summary on Bugs :

````````````````````````````````````````````````

1.  Artifacts appear on Edges of text characters when anti-aliased
text is drawn on Translucent background

2.  The issue is reproducible on all platforms - windows, linux and mac os.

3.  Besides, the issue is reproduced with the commonly used pixel
format- 4ByteArgb.

Root Cause & Solution :

````````````````````````````````````````````````

1.  The Macro: GlyphListAABlend4ByteArgb in File: LoopMacros.h uses
the standard blending algorithm

           dstColor = [ srcColor * glyphAlpha + dstColor * (1 -
glyphAlpha) ] / dstAlpha

2.  The above equation works only when the srcColor and dstColor are Opaque.

3.  When srcColor and dstColor are Translucent, their alphaComponent
will influence the visibility of the color, and visibility of the
color below.

4.  The new set of calculations for blending Translucent source and
destination colors is given as

           resAlpha = 1 - ((1 - srcAlpha) * (1 - dstAlpha))

           resColor = [ srcColor * srcAlpha + dstColor * dstAlpha * (1
-
srcAlpha) ] / resAlpha

5.  Reference text for the equation:
https://en.wikipedia.org/wiki/Alpha_compositing

6.  With the above modification to the blending logic, the artifacts
do not appear and issues are fixed.

Jtreg & Jprt Results :

````````````````````````````````````````````````

1.  A simple test-file: AADrawStringArtifact.java has been created to
be a part of jtreg test cases.

            The test file is well commented to explain - nature of
artifact and how the test tries to identify them.

            As required, the test case fails with Jdk 7, Jdk 8 and
succeeds with Jdk 9-internal (built with changes for the bug fix)

2.  The changes to blending logic lies within
java.desktop/src/share/native/...

            Henceforth, JPRT was used to ensure successful build
across all supported platforms

            Jprt Job Link :
http://scaaa637.us.oracle.com//archive/2016/03/2016-03-17-072001.prah
n
ara-linux.client

            The build succeeds on all platforms.

Kindly review the webrev link and provide your views and suggestions.

Webrev Link :
http://cr.openjdk.java.net/~psadhukhan/prahlad/8015070/webrev.00/

If the changes look good, we could take up the changes for source checkin.

Thank you for your time in review

Have a good day

Prahalad N.

Reply via email to