[
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475470#comment-16475470
]
Emmeran Seehuber commented on PDFBOX-4184:
------------------------------------------
Just got an idea in the shower ...
{code:java}
Benchmark                                   (zipLevel)   Mode  Cnt    Score   Error  Units
LosslessFactoryBenchmark.predictor                   3  thrpt    5  168.186 ± 1.884  ops/s
LosslessFactoryBenchmark.predictor                   6  thrpt    5  109.865 ± 2.022  ops/s
LosslessFactoryBenchmark.predictor                   9  thrpt    5   20.382 ± 0.432  ops/s
LosslessFactoryBenchmark.predictorBig                3  thrpt    5    2.617 ± 0.047  ops/s
LosslessFactoryBenchmark.predictorBig                6  thrpt    5    2.211 ± 0.029  ops/s
LosslessFactoryBenchmark.predictorBig                9  thrpt    5    1.627 ± 0.039  ops/s
LosslessFactoryBenchmark.predictorBigBytes           3  thrpt    5    2.219 ± 0.055  ops/s
LosslessFactoryBenchmark.predictorBigBytes           6  thrpt    5    1.880 ± 0.057  ops/s
LosslessFactoryBenchmark.predictorBigBytes           9  thrpt    5    1.454 ± 0.025  ops/s
LosslessFactoryBenchmark.rgbOnly                     3  thrpt    5  247.996 ± 7.758  ops/s
LosslessFactoryBenchmark.rgbOnly                     6  thrpt    5  128.242 ± 3.246  ops/s
LosslessFactoryBenchmark.rgbOnly                     9  thrpt    5   14.259 ± 0.339  ops/s
LosslessFactoryBenchmark.rgbOnlyBig                  3  thrpt    5    8.113 ± 0.290  ops/s
LosslessFactoryBenchmark.rgbOnlyBig                  6  thrpt    5    3.317 ± 0.059  ops/s
LosslessFactoryBenchmark.rgbOnlyBig                  9  thrpt    5    1.308 ± 0.025  ops/s
LosslessFactoryBenchmark.rgbOnlyBigBytes             3  thrpt    5    3.506 ± 0.066  ops/s
LosslessFactoryBenchmark.rgbOnlyBigBytes             6  thrpt    5    2.149 ± 0.070  ops/s
LosslessFactoryBenchmark.rgbOnlyBigBytes             9  thrpt    5    1.081 ± 0.019  ops/s
{code}
Now the predictor is always faster at zip level 9. It is still slower at the
other zip levels, but not that much.
[^lossless_predictor_based_imageencoding_v4.patch]
I would be fine with this, so no API change would be needed.
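For readers who want to get a feel for what the zip level costs, here is a minimal sketch. It assumes that the benchmark's zipLevel parameter corresponds to the compression level of java.util.zip.Deflater; the benchmark source is not part of this message, so treat that mapping as an assumption. The class name DeflateLevelSketch and its helper methods are made up for illustration and are not part of PDFBox.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not PDFBox code: deflate/inflate at a chosen level.
final class DeflateLevelSketch {

    /** Compresses the input with the given Deflater level (0..9). */
    static byte[] deflate(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Decompresses a zlib stream produced by deflate(). */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && !inflater.finished() && inflater.needsInput()) {
                    throw new IllegalStateException("truncated deflate stream");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Synthetic test data (an assumption, not the benchmark images).
        byte[] data = new byte[64 * 1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i / 256);
        }
        for (int level : new int[] {3, 6, 9}) {
            byte[] packed = deflate(data, level);
            System.out.println("level " + level + ": " + packed.length + " bytes");
        }
    }
}
```

Wrapping the deflate call in a timer would give a rough idea of the speed difference between the levels, but a JMH benchmark like the one above is the more reliable way to measure it.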
> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -----------------------------------------------------------------
>
> Key: PDFBOX-4184
> URL: https://issues.apache.org/jira/browse/PDFBOX-4184
> Project: PDFBox
> Issue Type: Improvement
> Components: Writing
> Affects Versions: 2.0.9
> Reporter: Emmeran Seehuber
> Priority: Minor
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: LoadGovdocs.java,
> lossless_predictor_based_imageencoding.patch,
> lossless_predictor_based_imageencoding_v2.patch,
> lossless_predictor_based_imageencoding_v3.patch,
> lossless_predictor_based_imageencoding_v4.patch,
> pdfbox_support_16bit_image_write.patch, png16-arrow-bad-no-smask.pdf,
> png16-arrow-bad.pdf, png16-arrow-good-no-mask.pdf, png16-arrow-good.pdf
>
>
> The attached patch adds support to write 16 bit per component images
> correctly. I've integrated a test for this here:
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still some room for improvements when writing lossless images, as
> the images are currently not efficiently encoded. I.e. you could use PNG
> encodings to get a better compression. (By adding a COSName.DECODE_PARMS with
> a COSName.PREDICTOR == 15 and encoding the images as PNG). But this is
> something for a later patch. It would also need another API, as there is a
> tradeoff speed vs compression ratio.
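The PNG encoding idea from the issue description can be sketched roughly as follows. With PREDICTOR == 15, every row is prefixed with one byte naming the PNG filter used for that row (0 = None, 1 = Sub, 2 = Up, 3 = Average, 4 = Paeth), and the encoder may pick a different filter per row. The sketch below uses the common "smallest sum of absolute differences" heuristic to pick the filter; that heuristic is an assumption on my side and not necessarily what the attached patches do. PngPredictorSketch and its methods are hypothetical names, and the code does not touch any PDFBox API. For 16 bit RGB, bpp (bytes per pixel) would be 6.

```java
// Hypothetical sketch of PNG row filtering, not the code from the patch.
final class PngPredictorSketch {
    static final int FILTER_NONE = 0;
    static final int FILTER_SUB = 1;
    static final int FILTER_UP = 2;
    static final int FILTER_AVERAGE = 3;
    static final int FILTER_PAETH = 4;

    /** Paeth predictor as defined by the PNG specification. */
    static int paeth(int a, int b, int c) {
        int p = a + b - c;
        int pa = Math.abs(p - a);
        int pb = Math.abs(p - b);
        int pc = Math.abs(p - c);
        if (pa <= pb && pa <= pc) {
            return a;
        }
        return pb <= pc ? b : c;
    }

    /** Predicted value for byte i of the row, given the previous row. */
    static int predict(int type, byte[] row, byte[] prev, int i, int bpp) {
        int a = i >= bpp ? row[i - bpp] & 0xFF : 0;   // left
        int b = prev[i] & 0xFF;                       // above
        int c = i >= bpp ? prev[i - bpp] & 0xFF : 0;  // upper left
        switch (type) {
            case FILTER_SUB:     return a;
            case FILTER_UP:      return b;
            case FILTER_AVERAGE: return (a + b) >>> 1;
            case FILTER_PAETH:   return paeth(a, b, c);
            default:             return 0;
        }
    }

    /**
     * Tries all five filters and returns the filter type byte followed by
     * the filtered row with the smallest sum of absolute differences.
     * For the first row, pass an all-zero array as prev.
     */
    static byte[] encodeRow(byte[] row, byte[] prev, int bpp) {
        byte[] best = null;
        long bestCost = Long.MAX_VALUE;
        for (int type = FILTER_NONE; type <= FILTER_PAETH; type++) {
            byte[] out = new byte[row.length + 1];
            out[0] = (byte) type;
            long cost = 0;
            for (int i = 0; i < row.length; i++) {
                byte d = (byte) ((row[i] & 0xFF) - predict(type, row, prev, i, bpp));
                out[i + 1] = d;
                cost += Math.abs(d); // difference treated as a signed byte
            }
            if (cost < bestCost) {
                bestCost = cost;
                best = out;
            }
        }
        return best;
    }

    /** Reverses encodeRow(); prev must be the already decoded previous row. */
    static byte[] decodeRow(byte[] encoded, byte[] prev, int bpp) {
        int type = encoded[0];
        byte[] row = new byte[encoded.length - 1];
        for (int i = 0; i < row.length; i++) {
            row[i] = (byte) ((encoded[i + 1] & 0xFF) + predict(type, row, prev, i, bpp));
        }
        return row;
    }
}
```

Trying all five filters for every row is what makes this slower than writing the raw RGB bytes, which is the speed vs compression ratio tradeoff mentioned above. The filtered rows would then be deflated as usual.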