[ 
https://issues.apache.org/jira/browse/PDFBOX-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1893:
------------------------------------

    Attachment: jbig2test.pdf-1.png
                jbig2test.pdf

The attached JBIG2 test file does not include the JBIG2-encoded part.

> Refactor color spaces
> ---------------------
>
>                 Key: PDFBOX-1893
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1893
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Rendering
>    Affects Versions: 2.0.0
>            Reporter: John Hewson
>            Assignee: John Hewson
>              Labels: color
>             Fix For: 2.0.0
>
>         Attachments: jbig2test.pdf, jbig2test.pdf-1.png
>
>
> I'm currently working on this, so I wanted to open an issue to let everyone 
> know.
> Color spaces need to be refactored in 2.0.0. Tilman noticed slowness in 
> PDFBOX-1851 due to using ICC profiles and calling ColorSpace#toRGB for every 
> pixel. For example, the file from PDFBOX-1851 went from rendering in 4 
> seconds to taking over 60 seconds.
> The solution is to use ColorConvertOp to convert an entire BufferedImage in 
> one go, taking advantage of AWT's native color management module. Color 
> conversions done this way are almost instantaneous, even for large images.
> The current design of color spaces within PDFBox depends upon conversions 
> being done on a per-pixel basis, so a significant refactoring is needed in 
> order to convert images using ColorConvertOp without having to resort to 
> per-pixel calls in cases such as a Separation color space which uses a CMYK 
> alternate color space via a tint-transform.
> The color space handling code is also tightly coupled to image handling. The 
> various classes which read images each have their own color handling code 
> which relies on per-pixel conversions. For this reason, any color space 
> refactoring must also include a significant refactoring of image handling 
> code. This is an opportunity to refactor all color handling so that it is 
> encapsulated within the color space classes, allowing downstream users to 
> call toRGB(float[]) or toRGB(BufferedImage) and not need to worry about tint 
> transforms and the like.
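
To illustrate the difference described above, here is a minimal sketch
that contrasts a per-pixel ColorSpace#toRGB loop with a single
ColorConvertOp pass. It uses only standard AWT classes. The class and
method names (BulkColorConversion, perPixel, bulk) are made up for this
example and are not PDFBox API, and the 8-bit grayscale input is an
assumption chosen to keep the sketch short.

```java
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;

// Hypothetical helper for illustration only; not part of PDFBox.
final class BulkColorConversion {

    // Slow path: one ColorSpace#toRGB call for every pixel.
    // Assumes an 8-bit, single-component (gray) source image.
    static BufferedImage perPixel(BufferedImage gray) {
        ColorSpace cs = gray.getColorModel().getColorSpace();
        BufferedImage rgb = new BufferedImage(
                gray.getWidth(), gray.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < gray.getHeight(); y++) {
            for (int x = 0; x < gray.getWidth(); x++) {
                float v = gray.getRaster().getSample(x, y, 0) / 255f;
                float[] c = cs.toRGB(new float[] { v });
                int r = Math.round(c[0] * 255);
                int g = Math.round(c[1] * 255);
                int b = Math.round(c[2] * 255);
                rgb.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return rgb;
    }

    // Fast path: convert the whole image in one operation.
    // With null hints, ColorConvertOp converts from the color space of the
    // source image to that of the destination image, so the destination
    // must be supplied explicitly.
    static BufferedImage bulk(BufferedImage src) {
        BufferedImage rgb = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        ColorConvertOp op = new ColorConvertOp(null);
        return op.filter(src, rgb);
    }
}
```

Both methods return an sRGB image of the same size as the input. Only the
second one lets AWT's color management module process the whole raster at
once, which is the approach the description proposes.
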
> ===========
> Here's a summary of the changes:
> - PDCcitt has been removed, its reading capability has moved to 
> CCITTFaxFilter and writing capability has moved to CCITTFactory.
> - PDJpeg has been removed. JPEG reading is now done by new code in DCTFilter 
> which correctly handles CMYK/YCCK color. This fixes various files where 
> images appeared like negatives. JPEG writing is done by new code in 
> JPEGFactory.
> - cleaned up JBIG2Filter
> - cleaned up JPXFilter; in particular, calling decode() caused the stream 
> dictionary to be updated, which was unsafe. I've also added a special 
> JPXColorSpace which wraps the embedded AWT color space of a JPX 
> BufferedImage; this replaces the awkward mapping of ColorSpace 
> to PDColorSpace.
> - Added better error messages for missing JAI plugins (JPX, JBIG2). A special 
> exception, MissingImageReaderException is now thrown.
> - PDXObjectForm has been renamed to PDFormXObject to match the PDF spec.
> - PDXObjectImage has been renamed in the same manner.
> - PDInlinedImage has been renamed to PDInlineImage for the same reason.
> - CCITTFaxDecodeFilter has been renamed to CCITTFaxFilter for consistency 
> with the other filters.
> - ImageParameters has been removed, it was used to represent inline image 
> parameters which are now simply members of PDInlineImage.
> - added PDColor which represents a color value, including patterns, it is 
> immutable for ease of use.
> - removed PDColorState which was a container for both a color and a color 
> space, in almost every case it was used to represent a color and so has been 
> replaced by PDColor and occasionally PDColorSpace.
> - moved most of the functionality of PDXObject into its subclasses
> - rewrote almost all color handling code in all PDColorSpace subclasses, 
> including fixing the calculations for L*a*b*, DeviceN, and indexed color 
> spaces. 
> - all color spaces now implement a toRGB(float[]) function for color 
> conversion, so external consumers of color spaces no longer have to know 
> about internals such as tint transforms.
> - image color conversion is now performed in one operation, using 
> ColorConvertOp, rather than pixel-by-pixel, this speeds up ICC transforms by 
> many orders of magnitude. Color spaces now expose a special method 
> toImageRGB(Raster) for this purpose. This fixes some known performance issues 
> with certain files.
> - updated Type1, Axial, Radial, and Gouraud shading contexts to call the new 
> toRGB functions. This is an interim measure; for better performance, the color 
> conversion should instead be done using toImageRGB after the entire gradient 
> is drawn to the raster.
> - creation of AWT Paint has been moved inside color spaces, hiding the 
> details from the caller. It is no longer possible to get an AWT Color from a 
> color space, only a Paint may be obtained.
> - removed PDColorSpaceFactory and moved its functionality into PDColorSpace.
> - moved some of the new shading and tiling pattern code to PDPattern so that 
> toPaint() is encapsulated in the color space.
> - new PDImage interface which is implemented by both PDInlineImage and 
> PDImageXObject
> - Image XObject image reading, masking and stencilling code has been 
> rewritten, resulting in the removal of CompositeImage.
> - new SampledImageReader performs image reading for all formats, including 
> JPEG and CCITT. The format itself is simply a filter, as is the case in the 
> PDF spec. New image reading handles decode arrays, interpolation, and 
> conversion of all image types to efficient 8bpp rasters. This replaces 
> PDPixelMap as well as reading code from PDJpeg and PDCcitt. Handling of decode 
> arrays fixes various issues where images were inverted, especially inline 
> images in Type 3 fonts.
> - removed SetNonStrokingICCBasedColor, SetNonStrokingIndexed, 
> SetNonStrokingPattern, SetNonStrokingSeparation, SetStrokingICCBasedColor, 
> SetStrokingIndexed, SetStrokingPattern, SetStrokingSeparation, and replaced 
> them with SetColor.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)