Please also post the full stack trace (for pdfbox / fontbox). My theory for why it happens is that addTrueTypeCollection() does not add the font as "*skipexception*" to the cache file, because that isn't done in its exception handler.

Tilman

On 04.12.2023 21:17, Tilman Hausherr wrote:
Does the stack trace appear at every start? If yes then it's a bug. The intent of the current code is that bad fonts aren't retried. The font cache file should contain a line with "*skipexception*" for that font. Can you look at it for the two font files?
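One way to check this is sketched below, under loud assumptions: the cache file location is not stated in this thread, so the path has to be passed in by hand, and nothing is assumed about the line format beyond the "*skipexception*" marker mentioned above. The class and method names are made up for illustration and are not part of PDFBox.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: lists the cache lines that
// carry the "*skipexception*" marker.
final class CacheScanSketch
{
    static final String MARKER = "*skipexception*";

    // Returns every line that contains the marker, in file order.
    static List<String> skippedLines(List<String> lines)
    {
        List<String> result = new ArrayList<>();
        for (String line : lines)
        {
            if (line.contains(MARKER))
            {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException
    {
        // args[0]: path to the font cache file (assumption: supplied by the user).
        // ISO-8859-1 maps every byte to a char, so decoding cannot fail.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        for (String line : skippedLines(lines))
        {
            System.out.println(line);
        }
    }
}
```

If neither of the two font files shows up in the output, that would be consistent with the fonts being retried at every start.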

I could change SHA512 to CRC32. It has the advantage that it won't trigger people who heard about MD5 😂

I made a test and CRC32 is 20% faster.

Tilman

On 04.12.2023 18:48, Gili Tzabari wrote:
I think the commit contains a typo:


<https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/font/FileSystemFontProvider.java?revision=1912514&view=markup&pathrev=1912514#l872>

    private static String computeHash(byte[] ba)
    {
        MessageDigest md;
        try
        {
            md = MessageDigest.getInstance("SHA512");
            byte[] md5 = md.digest(ba);
            return Hex.getString(md5);
        }
        catch (NoSuchAlgorithmException ex)
        {
            // never happens
            return "";
        }
    }

You shouldn't need to use SHA512 to detect changes by a non-malicious actor. MD5 should be plenty, and even CRC32 would be enough. I suggest downgrading the hash complexity.
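A minimal sketch of what a CRC32-based computeHash could look like, using only java.util.zip.CRC32 from the JDK. This is an illustration of the suggestion, not the code that ended up in PDFBox; the class name and the hex formatting are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical stand-in for FileSystemFontProvider#computeHash.
final class Crc32HashSketch
{
    // Returns the CRC32 checksum of the bytes as 8 lowercase hex digits.
    static String computeHash(byte[] ba)
    {
        CRC32 crc = new CRC32();
        crc.update(ba);
        // getValue() returns the 32-bit checksum in the low bits of a long
        return String.format("%08x", crc.getValue());
    }

    public static void main(String[] args)
    {
        byte[] sample = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(computeHash(sample)); // prints "cbf43926"
    }
}
```

CRC32 needs no MessageDigest lookup, so the NoSuchAlgorithmException handler goes away as well. It only guards against accidental changes to a font file, which is the use case described here.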

Gili

On 2023-12-04 10:21, Kjetil Ødegaard wrote:
Hi,

I tried to upgrade an app to PDFBox 3.0.1 and I see a performance issue.

It only affects the first PDF operation (after that it's quite fast), but
it's a bit annoying since it takes about 20 seconds (on my M1 MacBook).

Profiling reveals that this Kotlin code triggers the delay:

     val font = PDType1Font(Standard14Fonts.FontName.COURIER)

The thread dump shows that almost all time is spent in this method:

org.apache.pdfbox.pdmodel.font.FileSystemFontProvider#computeHash

I assume that this is related to PDFBOX-5684.

Is this possible to work around? Or is it possible to fix?

BR Kjetil



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org


