[ https://issues.apache.org/jira/browse/PDFBOX-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Doswald updated PDFBOX-3432:
------------------------------------
Attachment: patch_for_CustomMap_VS_GSCollections_benchmark.patch
fontbox-benchmark-CustomMap-VS-GSCollections.zip
I've created a patch and a benchmark to compare the GS-Collections int map with
the custom int map I wrote. To make the GS-Collections code usable here, I had
to strip it down quite a bit: I squashed the class hierarchy of IntIntHashMap
and inlined some static functions.
To my surprise, in this use case the GS-Collections map is slower than the
custom IntIntMap: roughly 9% slower on the desktop and roughly 30% slower on
the embedded system, going by the numbers below. One possible reason is the
different way one has to iterate over the map entries (GS-Collections creates
an IntIntPair object for each mapping).
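
To illustrate the iteration difference I suspect, here is a minimal sketch of
the two styles. It is not GS-Collections or PDFBox code; the class
IterationStyles, the Pair type and the IntIntConsumer callback are
hypothetical names made up for this example. Style A allocates one object per
mapping, style B hands the primitives to a callback directly.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not GS-Collections or PDFBox code. */
final class IterationStyles {

    /** Hypothetical per-entry object, standing in for something like IntIntPair. */
    static final class Pair {
        final int key;
        final int value;

        Pair(int key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Hypothetical callback that receives both primitives without boxing. */
    interface IntIntConsumer {
        void accept(int key, int value);
    }

    // Mappings stored as two parallel primitive arrays.
    private final int[] keys;
    private final int[] values;

    IterationStyles(int[] keys, int[] values) {
        this.keys = keys;
        this.values = values;
    }

    /** Style A: builds one Pair object for every mapping before the caller sees it. */
    List<Pair> pairs() {
        List<Pair> result = new ArrayList<>(keys.length);
        for (int i = 0; i < keys.length; i++) {
            result.add(new Pair(keys[i], values[i]));
        }
        return result;
    }

    /** Style B: passes each mapping to the callback; no per-entry allocation. */
    void forEach(IntIntConsumer consumer) {
        for (int i = 0; i < keys.length; i++) {
            consumer.accept(keys[i], values[i]);
        }
    }

    /** Sums all keys and values using style A. */
    static long sumViaPairs(IterationStyles m) {
        long sum = 0;
        for (Pair p : m.pairs()) {
            sum += p.key + p.value;
        }
        return sum;
    }

    /** Sums all keys and values using style B. */
    static long sumViaCallback(IterationStyles m) {
        long[] sum = {0};
        m.forEach((k, v) -> sum[0] += k + v);
        return sum[0];
    }
}
```

Both helpers compute the same result; whether the extra allocations in style A
actually explain the measured gap is only a guess on my part.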
The performance comparison numbers:
Desktop
  Benchmark                                   Mode      Score     Error  Units
  PdfBoxBenchmark.loadTTFFontCustomIntIntMap  avgt   1438.574 ±  37.288  us/op
  PdfBoxBenchmark.loadTTFFontGSIntIntMap      avgt   1569.250 ±  34.920  us/op

Embedded
  Benchmark                                   Mode      Score     Error  Units
  PdfBoxBenchmark.loadTTFFontCustomIntIntMap  avgt  28274.989 ± 742.245  us/op
  PdfBoxBenchmark.loadTTFFontGSIntIntMap      avgt  36742.291 ± 919.297  us/op

Maybe I've made a mistake in the benchmark?
> Optimize CID to GlyphId mapping (TTF)
> -------------------------------------
>
> Key: PDFBOX-3432
> URL: https://issues.apache.org/jira/browse/PDFBOX-3432
> Project: PDFBox
> Issue Type: Improvement
> Components: FontBox
> Affects Versions: 2.0.1, 2.0.2, 2.0.3
> Environment: Ubuntu 14.04.4 LTS
> Reporter: Michael Doswald
> Priority: Trivial
> Labels: optimization, performance
> Fix For: 2.0.3, 2.1.0
>
> Attachments: PDFBOX-3432_Optimize_CID_to_GlyphId_mapping_rev1.patch,
> fontbox-benchmark-CustomMap-VS-GSCollections.zip,
> patch_for_CustomMap_VS_GSCollections_benchmark.patch,
> pdfbox-performance-PDFBOX-3432.zip
>
>
> TTF fonts map code points (Code IDs) to glyphs. These are mappings from int
> to int. Because the JDK lacks map classes for primitive types, the code (e.g.
> in CmapSubtable) currently uses Map<Integer,Integer> for those mappings. This
> is inefficient in several ways:
> * Autoboxing/unboxing introduces a performance penalty
> * Boxing to Integer objects has a memory overhead
> * The JDK Map implementation has a big memory overhead for such simple objects
> For efficiency (execution time and memory consumption), I propose
> introducing a simple IntIntMap implementation that works with primitive
> integers.
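
A primitive int-to-int map as proposed in the issue description could look
roughly like the following. This is a minimal sketch under assumptions, not
the code from the attached patch: the class name IntIntMapSketch, the NO_VALUE
sentinel, and the open-addressing layout with linear probing are all
hypothetical choices made for illustration.

```java
/** Minimal int-to-int hash map sketch; hypothetical, not the PDFBOX-3432 patch. */
final class IntIntMapSketch {

    /** Assumed sentinel returned for missing keys (glyph ids are non-negative). */
    static final int NO_VALUE = -1;

    private int[] keys;
    private int[] values;
    private boolean[] used; // marks occupied slots, so any int can be a key
    private int size;

    IntIntMapSketch(int expectedSize) {
        // Capacity is a power of two and at least twice the expected size,
        // which keeps the load factor at or below 0.5.
        int capacity = 16;
        while (capacity / 2 < expectedSize) {
            capacity <<= 1;
        }
        keys = new int[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    int size() {
        return size;
    }

    /** Inserts or overwrites a mapping; no Integer objects are created. */
    void put(int key, int value) {
        if ((size + 1) * 2 > keys.length) {
            resize();
        }
        int i = indexFor(key);
        while (used[i]) {
            if (keys[i] == key) {
                values[i] = value; // overwrite existing mapping
                return;
            }
            i = (i + 1) & (keys.length - 1); // linear probing
        }
        used[i] = true;
        keys[i] = key;
        values[i] = value;
        size++;
    }

    /** Returns the mapped value, or NO_VALUE if the key is absent. */
    int get(int key) {
        int i = indexFor(key);
        while (used[i]) {
            if (keys[i] == key) {
                return values[i];
            }
            i = (i + 1) & (keys.length - 1);
        }
        return NO_VALUE;
    }

    /** Spreads the key bits and masks them to a slot index. */
    private int indexFor(int key) {
        int h = key * 0x9E3779B9;
        return (h ^ (h >>> 16)) & (keys.length - 1);
    }

    /** Doubles the capacity and reinserts all existing mappings. */
    private void resize() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                put(oldKeys[i], oldValues[i]);
            }
        }
    }
}
```

Compared with Map<Integer,Integer>, a layout like this stores each mapping in
plain arrays, so there is no boxing on put or get and no per-entry node
object. Removal and iteration are left out to keep the sketch short.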