Please also post the full stack trace (the pdfbox / fontbox part). I
have a theory about why it happens: addTrueTypeCollection() does not
add the font as "*skipexception*" to the cache file, because that is
not done in its exception handler.
Tilman
On 04.12.2023 21:17, Tilman Hausherr wrote:
Does the stack trace appear at every start? If yes then it's a bug.
The intent of the current code is that bad fonts aren't retried. The
font cache file should contain a line with "*skipexception*" for that
font. Can you look at it for the two font files?
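To make that check easier, here is a minimal sketch that prints every line of a cache file containing the "*skipexception*" marker. The class and method names are made up for illustration, and the default location (a file named .pdfbox.cache in the user home directory) is an assumption; pass the real path as the first argument if yours differs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SkipExceptionFinder
{
    // Returns every line of the given file that carries the "*skipexception*" marker.
    static List<String> findSkippedFonts(Path cacheFile) throws IOException
    {
        // ISO-8859-1 decodes any byte sequence, so the scan cannot fail on
        // unexpected bytes; non-ASCII font names may print garbled.
        try (Stream<String> lines = Files.lines(cacheFile, StandardCharsets.ISO_8859_1))
        {
            return lines.filter(line -> line.contains("*skipexception*"))
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Assumption: the cache lives in ~/.pdfbox.cache unless a path is given.
        Path cacheFile = args.length > 0
                ? Paths.get(args[0])
                : Paths.get(System.getProperty("user.home"), ".pdfbox.cache");
        if (!Files.isReadable(cacheFile))
        {
            System.out.println("Cache file not readable: " + cacheFile);
            return;
        }
        findSkippedFonts(cacheFile).forEach(System.out::println);
    }
}
```

If the two font files show up in the output, the skip marker is being written and the repeated stack trace has another cause.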
I could change SHA512 to CRC32. It has the advantage that it won't
trigger people who heard about MD5 😂
I made a test and CRC32 is 20% faster.
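For reference, a CRC32 variant of computeHash could look roughly like the sketch below. This is only an illustration of the idea using java.util.zip.CRC32, not the code that was or will be committed; the class name is hypothetical and the hex formatting is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class Crc32Hash
{
    // Computes the CRC32 checksum of the array and returns it as lowercase hex.
    // Unlike MessageDigest.getInstance(), this needs no checked-exception handling.
    static String computeHash(byte[] ba)
    {
        CRC32 crc = new CRC32();
        crc.update(ba);
        return Long.toHexString(crc.getValue());
    }

    public static void main(String[] args)
    {
        // "123456789" is the standard CRC32 check input; prints cbf43926
        System.out.println(computeHash("123456789".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Note that Long.toHexString() drops leading zeros, so the string length can vary; that is harmless as long as the same function writes and compares the value.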
Tilman
On 04.12.2023 18:48, Gili Tzabari wrote:
I think the commit contains a typo:
<https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/font/FileSystemFontProvider.java?revision=1912514&view=markup&pathrev=1912514#l872>
    private static String computeHash(byte[] ba)
    {
        MessageDigest md;
        try
        {
            md = MessageDigest.getInstance("SHA512");
            byte[] md5 = md.digest(ba);
            return Hex.getString(md5);
        }
        catch (NoSuchAlgorithmException ex)
        {
            // never happens
            return "";
        }
    }
You shouldn't need to use SHA512 to detect changes by a non-malicious
actor. MD5 should be plenty, and even CRC32 would be enough. I
suggest downgrading the hash complexity.
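To put rough numbers on the difference, something like the following sketch could be run locally. It is a naive timing loop, not a proper benchmark (no warm-up, no JMH), and the buffer size, round count and class name are arbitrary choices for illustration, so treat the results as indicative only.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;
import java.util.zip.CRC32;

public class HashTiming
{
    // Nanoseconds needed to digest the data 'rounds' times with the given algorithm.
    static long timeDigest(String algorithm, byte[] data, int rounds)
            throws NoSuchAlgorithmException
    {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++)
        {
            md.digest(data); // digest() also resets md for the next round
        }
        return System.nanoTime() - start;
    }

    // Nanoseconds needed to compute the CRC32 of the data 'rounds' times.
    static long timeCrc32(byte[] data, int rounds)
    {
        long sink = 0; // keeps the result in use so the loop is not optimized away
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++)
        {
            CRC32 crc = new CRC32();
            crc.update(data);
            sink += crc.getValue();
        }
        long elapsed = System.nanoTime() - start;
        return sink == -1 ? 0 : elapsed; // sink is never negative, so this returns elapsed
    }

    public static void main(String[] args) throws NoSuchAlgorithmException
    {
        byte[] data = new byte[8 * 1024 * 1024]; // stand-in for a large font file
        new Random(42).nextBytes(data);
        int rounds = 5;
        System.out.println("SHA-512: " + timeDigest("SHA-512", data, rounds) / 1_000_000 + " ms");
        System.out.println("MD5:     " + timeDigest("MD5", data, rounds) / 1_000_000 + " ms");
        System.out.println("CRC32:   " + timeCrc32(data, rounds) / 1_000_000 + " ms");
    }
}
```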
Gili
On 2023-12-04 10:21, Kjetil Ødegaard wrote:
Hi,
I tried to upgrade an app to PDFBox 3.0.1 and I see a performance
issue.
It only affects the first PDF operation (after that it's quite fast),
but it's a bit annoying since it takes about 20 seconds (on my M1
MacBook).
Profiling reveals that this Kotlin code triggers the delay:
val font = PDType1Font(Standard14Fonts.FontName.COURIER)
The thread dump shows that almost all time is spent in this method:
org.apache.pdfbox.pdmodel.font.FileSystemFontProvider#computeHash
I assume that this is related to PDFBOX-5684.
Is this possible to work around? Or is it possible to fix?
BR Kjetil
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org