[ https://issues.apache.org/jira/browse/PDFBOX-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117050#comment-17117050 ]
Emmeran Seehuber edited comment on PDFBOX-4847 at 5/26/20, 9:31 PM:
--------------------------------------------------------------------
The bug in the PNGConverter is that it did not write the ICC profile
correctly. It had an off-by-one error: it did not skip the 0-byte terminator
that follows the profile name (the name takes the first 1..79 bytes of the
iCCP chunk and is followed by a 0 byte). It also did not mark the stream as
FLATE_DECODE.
PDFBox (and likely all other PDF readers) just ignored the ICC profile because
of this (exception while decoding the profile). As a result the colors were
not correct, because the alternate color space DeviceRGB was used instead of
the embedded profile.
The minimal patch would be:
{code:java}
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PNGConverter.java b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PNGConverter.java
index f17cdd7cd..866cfbfba 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PNGConverter.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PNGConverter.java
@@ -400,11 +400,15 @@ final class PNGConverter
         if (state.iCCP != null || state.sRGB != null)
         {
             // We have got a color profile, which we must attach
             cosStream.setInt(COSName.N, colorSpace.getNumberOfComponents());
             cosStream.setItem(COSName.ALTERNATE,
                     colorSpace.getNumberOfComponents()
                             == 1 ? COSName.DEVICEGRAY : COSName.DEVICERGB);
             if (state.iCCP != null)
             {
+                cosStream.setItem(COSName.FILTER, COSName.FLATE_DECODE);
                 // We need to skip over the name
@@ -415,6 +419,7 @@ final class PNGConverter
                         break;
                     iccProfileDataStart++;
                 }
+                iccProfileDataStart++;
                 if (iccProfileDataStart >= state.iCCP.length)
                 {
                     LOG.error("Invalid iCCP chunk, to few bytes");
{code}
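To illustrate why both added lines matter, here is a minimal standalone
sketch of the iCCP payload layout as defined by the PNG specification: profile
name (1..79 bytes), a 0-byte terminator, one compression method byte (must be
0), then the zlib-compressed profile. Because the compressed bytes can be
copied into the PDF stream as they are, the stream has to declare a Flate
filter. The class and method names below are hypothetical and are not part of
PDFBox; this is not the PNGConverter code itself.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, for illustration only (not PDFBox API).
final class IccpChunkSketch
{
    /** Returns the offset of the compressed profile in an iCCP payload, or -1 if malformed. */
    static int compressedProfileOffset(byte[] chunk)
    {
        int pos = 0;
        // Scan the profile name: 1..79 bytes, terminated by a 0 byte.
        while (pos < 80 && pos < chunk.length && chunk[pos] != 0)
        {
            pos++;
        }
        if (pos == 0 || pos > 79 || pos >= chunk.length)
        {
            return -1; // empty name, name too long, or no terminator
        }
        pos++; // skip the 0-byte terminator (the step the original code missed)
        if (pos >= chunk.length || chunk[pos] != 0)
        {
            return -1; // compression method must be 0 (zlib/deflate)
        }
        pos++; // skip the compression method byte
        return pos < chunk.length ? pos : -1;
    }

    /** Inflates the profile; the data is zlib-compressed, hence the Flate filter in the PDF. */
    static byte[] inflateProfile(byte[] chunk)
    {
        int start = compressedProfileOffset(chunk);
        if (start < 0)
        {
            throw new IllegalArgumentException("Invalid iCCP chunk");
        }
        Inflater inflater = new Inflater();
        try
        {
            inflater.setInput(chunk, start, chunk.length - start);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished())
            {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                {
                    throw new IllegalArgumentException("Truncated iCCP profile data");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
        catch (DataFormatException e)
        {
            throw new IllegalArgumentException("Corrupt iCCP profile data", e);
        }
        finally
        {
            inflater.end();
        }
    }

    public static void main(String[] args)
    {
        // Build a fake payload: name "ICC", terminator, method 0, compressed dummy data.
        byte[] dummy = "dummy profile bytes".getBytes(StandardCharsets.US_ASCII);
        Deflater deflater = new Deflater();
        deflater.setInput(dummy);
        deflater.finish();
        byte[] compressed = new byte[256];
        int n = deflater.deflate(compressed);
        deflater.end();
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        payload.write("ICC".getBytes(StandardCharsets.US_ASCII), 0, 3);
        payload.write(0);
        payload.write(0);
        payload.write(compressed, 0, n);
        System.out.println("offset = " + compressedProfileOffset(payload.toByteArray()));
        System.out.println("bytes  = " + inflateProfile(payload.toByteArray()).length);
    }
}
```

Stopping one byte early, at the terminator, makes the decoder read the
terminator and the method byte as the start of the zlib data, which would
explain the exception while decoding the profile.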
But this will cause test failures in the PNGConverterTest, because the image
now has the right colors, but
- the JDK does not respect the embedded color profile in PNG images. Without
the fix for this in PNGConverterTest, the colors of the PNG that is read with
ImageIO for comparison will be "miles" off.
- comparing sRGB images does not work, even after applying the fix for the ICC
profile, because there are some color rounding differences (off by 1 on the
first pixel, for whatever reason, likely some different color conversion paths
somewhere). There is a massive difference between converting single pixel
values between colorspaces and converting a whole image at once (using
ColorConvertOp). The latter may choose slightly different colors depending on
the rendering intent and the colors in use in the image. The image from
PDImage.getImage() would have been converted with ColorConvertOp, but in
checkIdent(), which uses getRGB(), the image read with ImageIO would be color
converted "pixel by pixel". One could fix this by first converting the
expected image to sRGB using ColorConvertOp if it is not yet in sRGB.
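A rough sketch of that idea, assuming the expected image is available as a
BufferedImage: convert it as a whole with java.awt.image.ColorConvertOp before
comparing pixels. The class name and both helper methods are hypothetical and
are not the existing checkIdent() code. The difference helper is only there to
quantify how far two images are apart, for example to see the off-by-1 case.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;

// Hypothetical test helper, for illustration only (not the PNGConverterTest code).
final class SrgbCompareSketch
{
    /** Converts the whole image to sRGB in one pass; returns the input if it is already sRGB. */
    static BufferedImage toSRGB(BufferedImage image)
    {
        if (image.getColorModel().getColorSpace().isCS_sRGB())
        {
            return image;
        }
        int type = image.getColorModel().hasAlpha()
                ? BufferedImage.TYPE_INT_ARGB : BufferedImage.TYPE_INT_RGB;
        BufferedImage converted = new BufferedImage(image.getWidth(), image.getHeight(), type);
        // With null hints the op converts from the source image's ColorSpace
        // directly to the destination image's ColorSpace (sRGB here).
        new ColorConvertOp(null).filter(image, converted);
        return converted;
    }

    /** Largest per-channel difference (alpha, red, green, blue) between two equally sized images. */
    static int maxChannelDifference(BufferedImage a, BufferedImage b)
    {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight())
        {
            throw new IllegalArgumentException("Image sizes differ");
        }
        int max = 0;
        for (int y = 0; y < a.getHeight(); y++)
        {
            for (int x = 0; x < a.getWidth(); x++)
            {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                for (int shift = 0; shift <= 24; shift += 8)
                {
                    int diff = Math.abs(((p >>> shift) & 0xFF) - ((q >>> shift) & 0xFF));
                    max = Math.max(max, diff);
                }
            }
        }
        return max;
    }

    public static void main(String[] args)
    {
        // A gray image stands in for "an image that is not yet in sRGB".
        BufferedImage gray = new BufferedImage(4, 4, BufferedImage.TYPE_BYTE_GRAY);
        BufferedImage srgb = toSRGB(gray);
        System.out.println("sRGB after conversion: "
                + srgb.getColorModel().getColorSpace().isCS_sRGB());
    }
}
```

Both images then go through the same whole-image conversion path, so getRGB()
no longer has to convert the expected image pixel by pixel.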
If you want to apply this fix alone, you would need to temporarily disable the
test
{code:java}
PNGConverterTest.testImageConversionRGB16BitICC(){code}
The others should still work. Alternatively, you could extend checkIdent() to
correctly convert non-sRGB BufferedImages to sRGB first. I can also provide a
patch for that if you like.
> [PATCH] Allow to access raw image data and fix ICC profile embedding in
> PNGConverter
> ------------------------------------------------------------------------------------
>
> Key: PDFBOX-4847
> URL: https://issues.apache.org/jira/browse/PDFBOX-4847
> Project: PDFBox
> Issue Type: New Feature
> Components: PDModel, Writing
> Affects Versions: 2.0.19
> Reporter: Emmeran Seehuber
> Priority: Minor
> Labels: feature, patch
> Attachments: color_difference.png, pdfbox-rawimages.patch
>
>
> This patch was primarily intended to add access to raw image data (i.e.
> without any kind of color conversion/reduction). While implementing and
> testing it I also found a bug with ICC profile embedding in the PNGConverter.
> This patch does the following:
> - add a method getRawRaster() to PDImage. This allows reading the original
> raster data in 8 or 16 bit without any kind of color interpretation. Callers
> must know themselves what they want to do with it (e.g. to access the raw
> data of DeviceN images).
> - add a method getRawImage(), which tries to return the raster obtained by
> getRawRaster() as a BufferedImage. This is only successful if there is a
> matching Java ColorSpace for the colorspace of the image, i.e. only for
> ICCBased images. In theory this should also work for PDIndexed sRGB images,
> but I have to find a PDF with such an image first to test it.
> - add a -noColorConversion switch to the ExtractImage utility to extract
> images in their original colorspace. For CMYK images this only works when a
> TIFF encoder (e.g. from TwelveMonkeys) is in the class path.
> - add support to export PNGs with ICC profile data in ImageIOUtil.
> - fix a bug in PNGConverter which does not correctly embed the ICC profile
> from the png file.
> - the PNGConverterTest tests the raw images. While reading PNG files for
> comparison, it also ensures that the embedded ICC profile is correctly
> respected. The default PNG reader, at least up to JDK 11, does *not* respect
> the embedded ICC profile, i.e. the colors are wrong. But there is a
> workaround for this in the PNGConverterTest (which I have had in production
> for years now). See the screenshot for the correct color display of the
> png_rgb_romm_16.png test file (left side; macOS Preview app) and the wrong
> display (right side; Java; inside IDEA).
>
> Besides finding bugs like the one in the PNGConverter, access to the raw
> image also allows all kinds of fun color things. E.g. a future patch could
> allow using the raw images to print PDFs. If the PDF you want to print has
> images with a gamut > sRGB (i.e. from all modern cameras) and the target
> printer also has a gamut > sRGB (e.g. some ink photo printer), you will for
> sure see a difference in the resulting print. Such a mode would be rather
> slow, as the current sRGB image handling is optimized for speed, and using
> the original raw images would need on-demand color conversions in the
> printer driver. But you get „high quality“ out of it (at least with respect
> to colors).
> I don’t think this is in time for the 2.0.20 release.