Can text be extracted without adding a trailing space?

*test.txt*

    def hello_world():
        print("Hello World!")

    hello_world()

*The file ends on the line above, with no trailing CRLF.*

    java -jar pdfbox-app-2.0.25.jar TextToPDF -standardFont Courier test.pdf test.txt
    java -jar pdfbox-app-2.0.25.jar ExtractText test.pdf test1.txt

The output file, test1.txt, has a space appended to each line, and the last line has a CRLF appended. Using test1.txt as the input gives output that matches it. Using Win10.

    java -jar pdfbox-app-2.0.25.jar WriteDecodedDoc test.pdf test-decoded.txt

    %PDF-1.4
    %öäüß
    1 0 obj
    <<
    /Type /Catalog
    /Version /1.4
    /Pages 2 0 R
    >>
    endobj
    2 0 obj
    <<
    /Type /Pages
    /Kids [3 0 R]
    /Count 1
    >>
    endobj
    3 0 obj
    <<
    /Type /Page
    /MediaBox [0.0 0.0 612.0 792.0]
    /Parent 2 0 R
    /Contents 4 0 R
    /Resources 5 0 R
    >>
    endobj
    4 0 obj
    <<
    /Length 178
    >>
    stream
    /F1 10 Tf
    BT
    40 763.07751 Td
    0 -11.0775 Td
    (def hello_world\(\): ) Tj
    0 -11.0775 Td
    (    print\("Hello World!"\) ) Tj
    0 -11.0775 Td
    ( ) Tj
    0 -11.0775 Td
    (hello_world\(\) ) Tj
    ET
    endstream
    endobj
    5 0 obj
    <<
    /Font 6 0 R
    >>
    endobj
    6 0 obj
    <<
    /F1 7 0 R
    >>
    endobj
    7 0 obj
    <<
    /Type /Font
    /Subtype /Type1
    /BaseFont /Courier
    /Encoding /WinAnsiEncoding
    >>
    endobj
    xref
    0 8
    0000000000 65535 f
    0000000015 00000 n
    0000000078 00000 n
    0000000135 00000 n
    0000000247 00000 n
    0000000478 00000 n
    0000000511 00000 n
    0000000542 00000 n
    trailer
    <<
    /Root 1 0 R
    /ID [<2B2F22A234DF5483D5614CAB282ED31B> <2B2F22A234DF5483D5614CAB282ED31B>]
    /Size 8
    >>
    startxref
    637
    %%EOF
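
In the decoded content stream above, every string passed to `Tj` already ends with a space, so the trailing space appears to be written into the PDF by TextToPDF, and ExtractText is reproducing what is there. If the goal is simply to get back text that matches the original file, one possible workaround is to post-process the extracted file. The sketch below is not part of PDFBox. The function names `strip_trailing_whitespace` and `clean_file` are hypothetical, and it assumes the extracted file is UTF-8 and that trailing spaces and tabs are never significant.

```python
def strip_trailing_whitespace(text: str, newline: str = "\n") -> str:
    """Strip spaces/tabs from the end of each line and drop trailing line breaks.

    Hypothetical helper for cleaning ExtractText output; not a PDFBox API.
    """
    # Normalise CRLF and lone CR to LF so every line is split the same way.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    # Remove only trailing spaces and tabs, so leading indentation survives.
    lines = [line.rstrip(" \t") for line in lines]
    # Drop empty lines at the very end, which removes the final line break.
    while lines and lines[-1] == "":
        lines.pop()
    # Re-join with the requested line separator (no separator after the last line).
    return newline.join(lines)


def clean_file(src: str, dst: str, newline: str = "\r\n") -> None:
    """Read src, clean it, and write dst (assumes UTF-8; defaults to CRLF for Windows)."""
    # newline="" stops Python from translating line endings on read or write.
    with open(src, "r", encoding="utf-8", newline="") as f:
        raw = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(strip_trailing_whitespace(raw, newline=newline))
```

For the files in this report, `clean_file("test1.txt", "test1-clean.txt")` would write a copy of the extracted text with no trailing spaces and no line break after the last line. Blank lines in the middle of the file are kept; only empty lines at the very end are removed.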