PDFBox performance issue:  Encoding.java  getCharacter() method tweak
---------------------------------------------------------------------

                 Key: PDFBOX-603
                 URL: https://issues.apache.org/jira/browse/PDFBOX-603
             Project: PDFBox
          Issue Type: Improvement
          Components: Text extraction
    Affects Versions: 0.8.0-incubator
         Environment: All
            Reporter: Mel Martinez
         Attachments: Encoding.java

During parsing / text extraction the Encoding.getCharacter(COSName) method is 
invoked repeatedly.

It includes a string test that is performed up front on every call, even 
though the case it handles should occur only rarely.  The code should be 
restructured slightly so that the test runs later, only when it is needed. 
I.e. the method should succeed fast and fail slow.
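
To illustrate the "succeed fast, fail slow" idea, here is a minimal sketch. 
It is not the PDFBox code and not the attached rewrite: the GlyphLookup class, 
its table contents, and both method names are hypothetical, and the string 
test is assumed to be a check for a ".suffix" on the glyph name (the issue 
does not say which string test is involved). Plain String keys stand in for 
COSName.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a glyph-name table; not the PDFBox Encoding class. */
final class GlyphLookup {
    // Assumed sample data, for illustration only.
    private static final Map<String, String> NAME_TO_CHARACTER = new HashMap<>();
    static {
        NAME_TO_CHARACTER.put("A", "A");
        NAME_TO_CHARACTER.put("space", " ");
        NAME_TO_CHARACTER.put("fi", "\uFB01");
    }

    /** Slow shape: the string test runs on every call, before the lookup. */
    static String lookupSlow(String name) {
        int dot = name.indexOf('.');
        String key = dot > 0 ? name.substring(0, dot) : name;
        return NAME_TO_CHARACTER.get(key);
    }

    /** Fast shape: try the direct lookup first; run the string test only on a miss. */
    static String lookupFast(String name) {
        String character = NAME_TO_CHARACTER.get(name);
        if (character != null) {
            return character; // common case: no string scanning at all
        }
        // Rare case: strip an assumed ".suffix" and retry once.
        int dot = name.indexOf('.');
        if (dot > 0) {
            character = NAME_TO_CHARACTER.get(name.substring(0, dot));
        }
        return character; // null if the name is unknown either way
    }
}
```

With this table both methods return the same results; the difference is that 
lookupFast pays for indexOf and substring only when the direct lookup misses, 
which is the rare path if most names are already in the table.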

I'll post an attachment that rewrites the method slightly.  The performance 
gain is fairly significant.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
