Hello,

<08CF372D> is a hexadecimal string, which is basically a hex-encoded representation of a char/byte array. The exact encoding of this byte array is specified by the /Encoding key of the F1 font.

The PDF standard is optimized for drawing the glyphs that represent the text as fast as possible. For this reason, the logical text often can't be retrieved directly from the TJ/Tj operators and must be mapped to Unicode code points using the /ToUnicode map of the font. It's also possible that the logical text can be reconstructed only through geometrical considerations, such as finding chunks of the string that are close to each other and lie on the same line.

It's a complex task, and PoDoFo doesn't expose a high-level API to perform such text extraction. Also, the handling of the different predefined/custom encodings that the PDF standard allows you to use or define is incomplete and sometimes buggy. Work is in progress on a new API for text extraction, and it is working quite well. The API is expected to be introduced first in pdfmm (a fork of PoDoFo), with a proposed plan to merge it back into PoDoFo together with all the required enhancements to the handling of PDF encodings.
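To make the first steps concrete, here is a minimal C++ sketch of how such a hex string could be turned into character codes and then looked up in a ToUnicode table. It is only an illustration under assumptions: the function names (decodeHexString, splitCodes, mapToUnicode) are made up for this mail and are not PoDoFo or pdfmm API, the 2-byte code width is a guess that is typical for composite fonts but must really be taken from the font's /Encoding, and the lookup table has to be filled from the font's /ToUnicode CMap, which is not parsed here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Decode a PDF hexadecimal string such as "<08CF372D>" into raw bytes.
// White space between digits is ignored; if the number of digits is odd,
// the last digit is treated as if it were followed by 0.
std::vector<std::uint8_t> decodeHexString(const std::string& token)
{
    if (token.size() < 2 || token.front() != '<' || token.back() != '>')
        throw std::invalid_argument("not a PDF hex string");

    std::string digits;
    for (std::size_t i = 1; i + 1 < token.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(token[i]);
        if (std::isspace(c))
            continue;
        if (!std::isxdigit(c))
            throw std::invalid_argument("invalid hex digit");
        digits.push_back(static_cast<char>(c));
    }
    if (digits.size() % 2 != 0)
        digits.push_back('0');

    std::vector<std::uint8_t> bytes;
    for (std::size_t i = 0; i < digits.size(); i += 2)
        bytes.push_back(static_cast<std::uint8_t>(
            std::stoi(digits.substr(i, 2), nullptr, 16)));
    return bytes;
}

// Group bytes into big-endian character codes of a fixed width.
// ASSUMPTION: a fixed width; the real width comes from the font's encoding.
std::vector<std::uint32_t> splitCodes(const std::vector<std::uint8_t>& bytes,
                                      std::size_t bytesPerCode)
{
    assert(bytesPerCode > 0 && bytesPerCode <= 4);
    if (bytes.size() % bytesPerCode != 0)
        throw std::invalid_argument("byte count is not a multiple of the code width");

    std::vector<std::uint32_t> codes;
    for (std::size_t i = 0; i < bytes.size(); i += bytesPerCode) {
        std::uint32_t code = 0;
        for (std::size_t j = 0; j < bytesPerCode; ++j)
            code = (code << 8) | bytes[i + j];
        codes.push_back(code);
    }
    return codes;
}

// Look up each code in a table built from the font's /ToUnicode CMap.
// Simplification: one code point per code (a real CMap may map a code to
// several). Codes missing from the table become U+FFFD.
std::u32string mapToUnicode(const std::vector<std::uint32_t>& codes,
                            const std::map<std::uint32_t, char32_t>& toUnicode)
{
    std::u32string text;
    for (const std::uint32_t code : codes) {
        const auto it = toUnicode.find(code);
        text.push_back(it != toUnicode.end() ? it->second : U'\uFFFD');
    }
    return text;
}
```

For the string in your file, decodeHexString("<08CF372D>") gives the bytes 08 CF 37 2D, and splitCodes(bytes, 2) gives the two codes 0x08CF and 0x372D. Which characters these stand for depends entirely on the /ToUnicode CMap of F1, so the table passed to mapToUnicode has to come from that font.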
Regards,
Francesco

On Tue, 12 Apr 2022 at 12:35, Alex <iwifi...@163.com> wrote:
>
> Hi,
>
> When I opened a pdf file using podofobrowser.exe,if a pdfobject has a
> stream object,podofobrowser.exe will show the content of the stream as the
> following:
>
> BT
> /F2 10.56 Tf
> 1 0 0 1 136.46 758.28 Tm
> 0 g
> 0 G
> [(pdf)] TJ
> ET
>
> BT
> /F1 10.56 Tf
> 1 0 0 1 154.94 758.28 Tm
> 0 g
> 0 G
> [<08CF372D>] TJ
> ET
>
> In the first BT object,I know easily the text string is “pdf “ by ([(pdf)]
> TJ),but it is difficult to understand [<08CF372D>] TJ in the second BT
> object,could someone tell me how to understand [<08CF372D>],what encode type
> is this.
>
> Thanks,
>
> Alex
>
> _______________________________________________
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users