[ https://issues.apache.org/jira/browse/PDFBOX-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871535#comment-17871535 ]
Tilman Hausherr commented on PDFBOX-5864:
-----------------------------------------

These are vector graphics, not text, so the text extraction code never sees them. See this output in PDFDebugger, where the lines have no bounds.

!screenshot-1.png!

Looking into the page content stream, code like this can be seen:
{noformat}
q
/GS13 gs
120 95.5 m
141 95.5 l
S
{noformat}
m is moveto, l is lineto, and S strokes the path.

> Implementing the write String method of the PDFTextCtripper class, unable to get certain underscores
> ----------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-5864
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5864
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>            Reporter: liu
>            Priority: Major
>         Attachments: 1.pdf, image-2024-08-07-11-20-35-180.png, screenshot-1.png
>
>
> [^1.pdf]
> I want to parse and obtain the coordinates and positions of these underscores, but I cannot obtain them using the above method. Is there any way to obtain the coordinates and positions of these underscores?
> !image-2024-08-07-11-20-35-180.png|width=471,height=216!
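
To illustrate how the m, l and S operators in the comment above describe a line, here is a minimal sketch in plain Java. It is a toy tokenizer, not PDFBox API: the class name StrokedLineSketch, the Segment type and the strokedSegments method are made up for this example, and the code only understands whitespace-separated tokens with the operators m, l, S and n. The coordinates it returns are the raw operands, so the current transformation matrix is not applied. For real documents, subclassing PDFBox's PDFGraphicsStreamEngine and recording its moveTo, lineTo and strokePath callbacks is probably the more suitable route.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration only: not PDFBox API and not a complete content stream parser. */
final class StrokedLineSketch {

    /** One straight segment, in the coordinates written in the content stream. */
    static final class Segment {
        final double x1, y1, x2, y2;

        Segment(double x1, double y1, double x2, double y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }

        @Override
        public String toString() {
            return "(" + x1 + ", " + y1 + ") -> (" + x2 + ", " + y2 + ")";
        }
    }

    /** Collects the segments built with m/l and painted with S. */
    static List<Segment> strokedSegments(String content) {
        List<Segment> stroked = new ArrayList<>();
        List<Segment> pending = new ArrayList<>();  // current path, not painted yet
        List<Double> operands = new ArrayList<>();  // numbers seen since the last operator
        double curX = 0, curY = 0;
        boolean hasPoint = false;

        for (String token : content.trim().split("\\s+")) {
            if (token.matches("[+-]?(\\d+(\\.\\d*)?|\\.\\d+)")) {
                operands.add(Double.valueOf(token));
                continue;
            }
            int n = operands.size();
            switch (token) {
                case "m":  // moveto: start a new subpath
                    if (n >= 2) {
                        curX = operands.get(n - 2);
                        curY = operands.get(n - 1);
                        hasPoint = true;
                    }
                    break;
                case "l":  // lineto: segment from the current point
                    if (n >= 2 && hasPoint) {
                        double x = operands.get(n - 2);
                        double y = operands.get(n - 1);
                        pending.add(new Segment(curX, curY, x, y));
                        curX = x;
                        curY = y;
                    }
                    break;
                case "S":  // stroke: the pending path becomes visible lines
                    stroked.addAll(pending);
                    pending.clear();
                    hasPoint = false;
                    break;
                case "n":  // end the path without painting it
                    pending.clear();
                    hasPoint = false;
                    break;
                default:   // names such as /GS13 and all other operators are ignored
                    break;
            }
            operands.clear();
        }
        return stroked;
    }

    public static void main(String[] args) {
        for (Segment s : strokedSegments("q /GS13 gs 120 95.5 m 141 95.5 l S")) {
            System.out.println(s);
        }
    }
}
```

For the snippet from the comment this yields a single horizontal segment from (120, 95.5) to (141, 95.5), which is the kind of line that is drawn as an underscore. To relate it to text positions, the points would still have to be transformed by the current transformation matrix and compared with the text coordinates.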