Hello,

<08CF372D> is a hexadecimal string, which is basically a hex-encoded
representation of a char/byte array. The exact encoding of this byte
array is specified in the /Encoding key of the F1 font. The PDF
standard is optimized to draw the glyphs representing the text as fast
as possible. For this reason, the logical text often can't be
retrieved directly from the TJ/Tj operators, and must be mapped to
Unicode code points by using the /ToUnicode map of the font. It's
also possible that the logical text can be reconstructed only through
geometric considerations, such as finding chunks of the string that
lie close together and on the same line.

It's a complex task, and PoDoFo doesn't expose a high-level API to
perform such text extraction. Also, its handling of the different
predefined/custom encodings that the PDF standard allows you to use or
define is incomplete and sometimes buggy. Work is under way on a new
API for text extraction, and it is working quite well. The API is
expected to be introduced first in pdfmm (a fork of PoDoFo), with a
proposed plan to merge it back into PoDoFo together with all the
required enhancements to the handling of PDF encodings.
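
To make the first steps concrete (decoding the hex string, then applying
a /ToUnicode-style lookup), here is a minimal Python sketch. It is not
PoDoFo or pdfmm code. The function names are made up for this example,
the two-byte code width is an assumption (the real width depends on the
/Encoding of F1), and the lookup table contains invented placeholder
values, since the real ones have to be read from the /ToUnicode CMap of
the font in your file.

```python
def parse_hex_string(token):
    """Decode a PDF hexadecimal string token such as '<08CF372D>' to bytes."""
    body = token.strip()
    if not (body.startswith("<") and body.endswith(">")):
        raise ValueError("not a PDF hexadecimal string: %r" % token)
    # White space between the hex digits is ignored.
    digits = "".join(body[1:-1].split())
    # If the final digit is missing, it is assumed to be 0.
    if len(digits) % 2:
        digits += "0"
    return bytes.fromhex(digits)


def split_codes(data, width):
    """Split raw bytes into big-endian character codes of `width` bytes.

    ASSUMPTION: a fixed code width. Real fonts may use 1-byte codes, or a
    CMap with variable-width codes.
    """
    if len(data) % width:
        raise ValueError("byte count is not a multiple of the code width")
    return [int.from_bytes(data[i:i + width], "big")
            for i in range(0, len(data), width)]


def to_text(codes, to_unicode):
    """Map character codes to text; unmapped codes become U+FFFD."""
    return "".join(to_unicode.get(code, "\ufffd") for code in codes)


if __name__ == "__main__":
    raw = parse_hex_string("<08CF372D>")
    codes = split_codes(raw, 2)  # assumed 2-byte codes
    print([hex(c) for c in codes])  # ['0x8cf', '0x372d']

    # HYPOTHETICAL table: placeholder values only. The real mapping must
    # be taken from the /ToUnicode CMap of font F1.
    hypothetical_to_unicode = {0x08CF: "X", 0x372D: "Y"}
    print(to_text(codes, hypothetical_to_unicode))
```

Even with the correct table, this recovers only the characters of one
string operand. Rebuilding words and lines from several such chunks is
the geometric part mentioned above, and it is not covered here.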

Regards,
Francesco

On Tue, 12 Apr 2022 at 12:35, Alex <iwifi...@163.com> wrote:
>
> Hi,
>
> When I open a PDF file with podofobrowser.exe and a PDF object has a
> stream, podofobrowser.exe shows the content of the stream as
> follows:
>
> BT
> /F2 10.56 Tf
> 1 0 0 1 136.46 758.28 Tm
> 0 g
> 0 G
> [(pdf)] TJ
> ET
>
> BT
> /F1 10.56 Tf
> 1 0 0 1 154.94 758.28 Tm
> 0 g
> 0 G
> [<08CF372D>] TJ
> ET
>
> In the first BT object, I can easily tell that the text string is "pdf"
> from [(pdf)] TJ, but [<08CF372D>] TJ in the second BT object is hard
> to understand. Could someone tell me how to interpret [<08CF372D>],
> and what encoding this is?
>
> Thanks,
> Alex
>
> _______________________________________________
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
