[ 
https://issues.apache.org/jira/browse/PDFBOX-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275562#comment-17275562
 ] 

Jairo Figueroa Jiménez commented on PDFBOX-5049:
------------------------------------------------

The problem is in the method:

    /**
     * Returns the width of the given Unicode string.
     *
     * @param text The text to get the width of.
     * @return The width of the string in 1/1000 units of text space.
     * @throws IOException If there is an error getting the width information.
     * @throws IllegalArgumentException if a character isn't supported by the font.
     */
    public float getStringWidth(String text) throws IOException
    {
        byte[] bytes = encode(text);
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);
        
        float width = 0;
        while (in.available() > 0)
        {
            int code = readCode(in);
            width += getWidth(code);
        }
        
        return width;
    }

Apparently the 99760 bytes contained in the string are traversed over and over,
since this method is called from many different methods, and that makes the JVM
practically run out of memory.

We would need to look into a mechanism that keeps the encoded bytes of the text
in memory, so that they are not traversed so many times.
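
One possible shape for such a mechanism, as a rough sketch only: compute the
width of every character once, keep the running totals in an array, and answer
every later "how wide is this substring" question with a subtraction. The class
PrefixWidths and its methods are hypothetical names and are not part of PDFBox.
The per-character width callback is also an assumption; in PDFBox it might wrap
a getStringWidth call on a single-character string. The sketch assumes widths
are non-negative and simply add up per character, as in the loop quoted above.

```java
import java.util.function.IntToDoubleFunction;

/**
 * Hypothetical helper, not part of PDFBox: caches cumulative widths so the
 * width of any substring is a subtraction instead of a new encode-and-sum pass.
 */
final class PrefixWidths
{
    // prefix[i] = total width of the first i code points of the text
    private final double[] prefix;

    PrefixWidths(String text, IntToDoubleFunction widthOfCodePoint)
    {
        int[] codePoints = text.codePoints().toArray();
        prefix = new double[codePoints.length + 1];
        for (int i = 0; i < codePoints.length; i++)
        {
            // each character's width is looked up exactly once
            prefix[i + 1] = prefix[i] + widthOfCodePoint.applyAsDouble(codePoints[i]);
        }
    }

    /** Width of the code points in [from, to), in the callback's units. */
    double width(int from, int to)
    {
        return prefix[to] - prefix[from];
    }

    /**
     * Largest end index such that width(from, end) <= maxWidth, found by
     * binary search. Assumes non-negative widths, so prefix is non-decreasing.
     * Returns from when not even one character fits.
     */
    int lastFittingEnd(int from, double maxWidth)
    {
        int lo = from;
        int hi = prefix.length - 1;
        while (lo < hi)
        {
            int mid = (lo + hi + 1) >>> 1;
            if (width(from, mid) <= maxWidth)
            {
                lo = mid;
            }
            else
            {
                hi = mid - 1;
            }
        }
        return lo;
    }
}
```

With a table like this, splitting a single over-long word would take one pass
to build the array and then one binary search per output line, rather than a
fresh measurement of the remaining text for each candidate break position.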

> PlainText.Paragraph.getLines extremely slow on long lines
> ---------------------------------------------------------
>
>                 Key: PDFBOX-5049
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5049
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 2.0.21
>            Reporter: Tilman Hausherr
>            Priority: Major
>         Attachments: GHOSTSCRIPT-690526-0.pdf, GHOSTSCRIPT-692591-0.pdf, 
> GHOSTSCRIPT-692591-2.pdf
>
>
> The three attached files are very slow when constructing the appearance on 
> the field "gendate" (on the last page). That is a multiline field but with an 
> extremely long text.
> It happens at "// single word does not fit into width".


