[
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618743#comment-16618743
]
Emmeran Seehuber commented on PDFBOX-4184:
------------------------------------------
[~tilman] If an image carries an ICC profile other than the built-in sRGB
profile, you need that ICC profile; otherwise the colors will simply be wrong.
You should not look at (r,g,b) or (c,m,y,k) as concrete color values, but
rather as vectors within a color space. Without a profile describing that
color space you have no idea what real colors the vector values result in.
DeviceRGB is (on screen) often interpreted as sRGB, but what DeviceCMYK means
is really up to the concrete interpreting device, i.e. it will look different
on every printer (brightness, color, ...). So DeviceCMYK as a color space for
an image mostly means "random" unless you are explicitly targeting one
specific printer.
The ICC profile describes how to transform the color-vector-data into other
colorspaces, e.g. into sRGB to view on the screen or the concrete ICC profile
of the printing device.
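
To illustrate the "vector within a color space" point, here is a minimal
sketch using only java.awt.color. The class and method names are made up for
this example; it is not code from the patch. It interprets the same (r,g,b)
vector in two of the JDK's built-in color spaces and converts each to sRGB:

```java
import java.awt.color.ColorSpace;

class ColorVectorSketch {
    /** Interprets the components in the given color space and converts them to sRGB. */
    static float[] toSrgb(ColorSpace space, float[] components) {
        return space.toRGB(components);
    }

    public static void main(String[] args) {
        float[] vector = {0.5f, 0.5f, 0.5f};
        ColorSpace srgb = ColorSpace.getInstance(ColorSpace.CS_sRGB);
        ColorSpace linear = ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB);

        // Same numbers, different color space, different resulting color.
        float[] a = toSrgb(srgb, vector);
        float[] b = toSrgb(linear, vector);
        System.out.printf("as sRGB:       %.3f %.3f %.3f%n", a[0], a[1], a[2]);
        System.out.printf("as linear RGB: %.3f %.3f %.3f%n", b[0], b[1], b[2]);
    }
}
```

The vector read as linear RGB should come out noticeably brighter than the
same vector read as sRGB, which is the effect you get when a profile is
dropped or replaced by a wrong one.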
If you load images in Java using ImageIO you usually (especially when using
TwelveMonkeys) get an sRGB image, so you would never hit this path. If you
want to load an image with its real color profile you must pass a specially
prepared BufferedImage (i.e. one with the right profile) into ImageIO. So you
won't get an image with a color space other than sRGB by accident.
If you have an image with an ICC profile, you always want the image in this
color space with the profile attached, since it is already not so easy to get
an image in anything other than sRGB.
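
A caller who wants to know which case applies can ask the image's color
model. A minimal sketch follows; the helper name isSrgb is my own and not
part of PDFBox or ImageIO:

```java
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;

class ImageColorSpaceCheck {
    /** Returns true if the image's color model uses the JDK's built-in sRGB color space. */
    static boolean isSrgb(BufferedImage image) {
        ColorSpace space = image.getColorModel().getColorSpace();
        return space.isCS_sRGB();
    }

    public static void main(String[] args) {
        BufferedImage rgb = new BufferedImage(1, 1, BufferedImage.TYPE_INT_RGB);
        BufferedImage gray = new BufferedImage(1, 1, BufferedImage.TYPE_BYTE_GRAY);
        System.out.println("TYPE_INT_RGB is sRGB:   " + isSrgb(rgb));
        System.out.println("TYPE_BYTE_GRAY is sRGB: " + isSrgb(gray));
    }
}
```

Only images for which this check fails would need their own profile embedded.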
Regarding file size bloat: Yes, the ICC profiles will add up, especially if
you have many images. The correct solution would be an ICC_Profile <->
PDICCBased cache in the document, so that the same profile does not get
encoded twice.
Should I implement such a cache? In my application I manually deduplicate the
ICC profiles at the moment.
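
A rough sketch of what such a cache could look like is below. It is an
assumption about the shape only, not the proposed PDFBox change: the class
IccProfileCache and its methods are hypothetical, and the cached type is left
generic so the example runs without PDFBox (in PDFBox it would be PDICCBased).
Profiles are keyed by a SHA-256 digest of their raw bytes, so two ICC_Profile
instances with identical data share one encoded object:

```java
import java.awt.color.ICC_Profile;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-document cache: one encoded object per distinct ICC profile. */
class IccProfileCache<T> {
    private final Map<String, T> byDigest = new HashMap<>();

    /** Returns the cached object for this profile, creating it via the encoder on first use. */
    T getOrCreate(ICC_Profile profile, Function<ICC_Profile, T> encoder) {
        String key = digest(profile.getData());
        return byDigest.computeIfAbsent(key, k -> encoder.apply(profile));
    }

    /** Number of distinct profiles seen so far. */
    int size() {
        return byDigest.size();
    }

    private static String digest(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

Hashing the profile bytes costs one pass over each profile per image, which
should be small compared to encoding the image itself.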
The attached patch [^fix_profile_use4.patch] fixes the test driver and also
specifies an "Alternate" colorspace for the profile, for all those devices
which cannot handle ICC profiles. With the correct ICC_Profile specified, the
"roundtrip" sRGB->ISO Coated->sRGB now also works correctly, so the image can
be compared with the original image.
> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -----------------------------------------------------------------
>
> Key: PDFBOX-4184
> URL: https://issues.apache.org/jira/browse/PDFBOX-4184
> Project: PDFBox
> Issue Type: Improvement
> Components: Writing
> Affects Versions: 2.0.9
> Reporter: Emmeran Seehuber
> Priority: Minor
> Fix For: 2.0.12, 3.0.0 PDFBox
>
> Attachments: 16bit.png, LoadGovdocs.java, fix_profile_use.patch,
> fix_profile_use3.patch, fix_profile_use4.patch, images.zip,
> lossless_predictor_based_imageencoding.patch,
> lossless_predictor_based_imageencoding_v2.patch,
> lossless_predictor_based_imageencoding_v3.patch,
> lossless_predictor_based_imageencoding_v4.patch,
> lossless_predictor_based_imageencoding_v5.patch,
> lossless_predictor_based_imageencoding_v6.patch,
> pdfbox_support_16bit_image_write.patch, png16-arrow-bad-no-smask.pdf,
> png16-arrow-bad.pdf, png16-arrow-good-no-mask.pdf, png16-arrow-good.pdf,
> size_compare.txt
>
>
> The attached patch adds support to write 16 bit per component images
> correctly. I've integrated a test for this here:
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still some room for improvements when writing lossless images, as
> the images are currently not efficiently encoded. I.e. you could use PNG
> encodings to get a better compression. (By adding a COSName.DECODE_PARMS with
> a COSName.PREDICTOR == 15 and encoding the images as PNG). But this is
> something for a later patch. It would also need another API, as there is a
> tradeoff speed vs compression ratio.
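
For reference, the PNG-style encoding mentioned in the quoted description
filters each row before compression and prefixes it with a filter type byte.
The sketch below shows only the "Sub" filter (type 1) for a single row. The
class name is made up, and this is not the code from the
lossless_predictor_based_imageencoding patches. A real encoder would pick a
filter per row and also set Colors, BitsPerComponent and Columns in the
decode parameters.

```java
class PngSubFilterSketch {
    /**
     * Applies the PNG "Sub" filter to one row of raw sample bytes.
     * The result is one byte longer: the leading byte is the filter type (1).
     */
    static byte[] subFilterRow(byte[] raw, int bytesPerPixel) {
        byte[] out = new byte[raw.length + 1];
        out[0] = 1; // PNG filter type "Sub"
        for (int i = 0; i < raw.length; i++) {
            // Byte at the same position in the previous pixel, or 0 for the first pixel.
            int left = i >= bytesPerPixel ? raw[i - bytesPerPixel] & 0xFF : 0;
            out[i + 1] = (byte) ((raw[i] & 0xFF) - left); // difference modulo 256
        }
        return out;
    }

    public static void main(String[] args) {
        // Two 16 bit RGB pixels: 6 bytes per pixel, big-endian samples.
        byte[] row = {0, 10, 0, 20, 0, 30, 0, 12, 0, 22, 0, 32};
        System.out.println(java.util.Arrays.toString(subFilterRow(row, 6)));
    }
}
```

Neighbouring pixels with similar values turn into small differences, which
the Flate compressor then handles better than the raw samples.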
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)