Is the font embedded?
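
The BaseFont value quoted below, KAIXMV+Calibri-Bold, starts with six uppercase letters and a plus sign. That is the tag PDF uses for font subsets, which suggests an embedded font program. For a TrueType font the program would normally sit in the FontFile2 stream of the FontDescriptor dictionary, not on the font dictionary itself, which may be why HasStream() returns false on the font object. Here is a minimal sketch of the tag check in plain C++; hasSubsetTag and stripSubsetTag are hypothetical helper names, not PoDoFo functions:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// True if the name starts with a subset tag: six uppercase letters and '+'.
bool hasSubsetTag(const std::string& baseFont)
{
    if (baseFont.size() < 8 || baseFont[6] != '+') return false;
    for (std::size_t i = 0; i < 6; ++i) {
        if (!std::isupper(static_cast<unsigned char>(baseFont[i]))) return false;
    }
    return true;
}

// Returns the font name without the subset tag, or the name unchanged.
std::string stripSubsetTag(const std::string& baseFont)
{
    return hasSubsetTag(baseFont) ? baseFont.substr(7) : baseFont;
}
```

A positive result is only a hint. The presence of a FontFile2 entry in the font descriptor is what actually tells you that the font data is in the file.
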
On Fri, Sep 6, 2013 at 2:30 AM, Filip Djumic <theprop...@gmail.com> wrote:
> Thanks for the suggestion.
> The problem is that I can't get the stream of the PdfFont (obtained via
> PdfDocument->GetFont(), as in the TextExtractor example): GetStream() doesn't
> return anything and PdfFont->GetObject()->HasStream() returns false.
> The same seems to be true of objects fetched by
> PdfPage->GetFromResources().
> Is there something I'm missing here?
> I apologize if the solution is trivial, but I wasn't able to find it.
>
>
> On Thu, Sep 5, 2013 at 8:11 AM, Dominik Seichter <
> domseich...@googlemail.com> wrote:
>
>> See PdfFont::GetObject()->GetStream()->GetData() and similar methods
>>
>> Cheers
>> On 05.09.2013 at 01:34, "Filip Djumic" <theprop...@gmail.com> wrote:
>>
>>> I'm now trying to use the FreeType API to access the font's cmap table. To
>>> create a font face, I need either a font file name or a buffer
>>> containing the font data. How do I get hold of one of these?
>>> If I understood correctly, the font file is embedded in the PDF, but how do
>>> I get to it?
>>> In PoDoFo, only PdfFontMetricsFreetype seems to create new font faces,
>>> and it does so from the font file name or font data buffer
>>> passed as constructor arguments. Since I don't have the file name, I guess
>>> I need an in-memory buffer of the font data, but
>>> PdfFontMetricsFreetype gets that in the GetWin32Font function, which seems to
>>> deal only with fonts known to Windows. My font name is something like "TT1.1"
>>> and the BaseFont value is "KAIXMV+Calibri-Bold", so GetWin32Font doesn't return
>>> anything...
>>> Can anyone help me out with this? I'm completely stuck. I just can't
>>> figure out how to create a FreeType font face using data from the PDF
>>> and PoDoFo.
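
Inline note: once the embedded font program is in a memory buffer, FreeType's FT_New_Memory_Face can create a face from it. As a FreeType-independent alternative, here is a hedged sketch that walks the sfnt table directory by hand and lists the (platform ID, encoding ID) pairs of the "cmap" subtables. listCmapSubtables and the two readers are hypothetical names. The buffer is assumed to hold a plain TrueType font, not a collection, and a subset font embedded in a PDF may have no "cmap" table at all, in which case the result is empty.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Big-endian readers; the caller guarantees that the bytes read are in range.
static std::uint16_t readU16(const std::vector<std::uint8_t>& d, std::size_t pos)
{
    return static_cast<std::uint16_t>((d[pos] << 8) | d[pos + 1]);
}

static std::uint32_t readU32(const std::vector<std::uint8_t>& d, std::size_t pos)
{
    return (static_cast<std::uint32_t>(readU16(d, pos)) << 16) | readU16(d, pos + 2);
}

// Returns (platformID, encodingID) for each cmap subtable.
// Returns an empty or partial list if the table is missing or the data is truncated.
std::vector<std::pair<int, int>> listCmapSubtables(const std::vector<std::uint8_t>& font)
{
    std::vector<std::pair<int, int>> result;
    if (font.size() < 12) return result;            // 12-byte offset table
    const std::size_t numTables = readU16(font, 4);
    for (std::size_t i = 0; i < numTables; ++i) {
        const std::size_t rec = 12 + 16 * i;        // 16-byte table records
        if (rec + 16 > font.size()) return result;
        if (readU32(font, rec) != 0x636D6170u) continue;   // tag 'cmap'
        const std::size_t cmap = readU32(font, rec + 8);   // table offset
        if (cmap > font.size() - 4) return result;
        const std::size_t count = readU16(font, cmap + 2); // number of subtables
        for (std::size_t j = 0; j < count; ++j) {
            const std::size_t enc = cmap + 4 + 8 * j;      // 8-byte encoding records
            if (enc + 8 > font.size()) break;
            result.emplace_back(readU16(font, enc), readU16(font, enc + 2));
        }
        break;
    }
    return result;
}
```

A result containing (3, 0) or (1, 0) would tell you which of the two lookup rules from the specification applies to the font.
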
>>>
>>> Filip
>>>
>>>
>>> On Wed, Jul 31, 2013 at 7:25 PM, Filip Djumic <theprop...@gmail.com> wrote:
>>>
>>>> Thank you for your reply.
>>>>
>>>> If I understood correctly, I need to use the currently unused
>>>> FT_Library* parameter of the PdfFontFactory::CreateFont function to access
>>>> the FreeType API for that font.
>>>> The FreeType API should then provide all the data needed to encode
>>>> and extract the text in this font.
>>>> Is this a correct outline of how it should be done?
>>>>
>>>> F.
>>>>
>>>>
>>>> On Thu, Jul 18, 2013 at 3:08 AM, Leonard Rosenthol
>>>> <lrose...@adobe.com> wrote:
>>>>
>>>>> You need to dig into the font data/format itself. Since you have
>>>>> access to FreeType, you should be able to use its public APIs to get what
>>>>> you need.
>>>>>
>>>>> Leonard
>>>>>
>>>>> From: Filip Djumic <theprop...@gmail.com>
>>>>> Date: Wednesday, July 17, 2013 9:02 PM
>>>>> To: "podofo-users@lists.sourceforge.net" <
>>>>> podofo-users@lists.sourceforge.net>
>>>>> Subject: [Podofo-users] Text extraction for TrueType fonts without
>>>>> encoding entry
>>>>>
>>>>> I'm trying to extract plain text with PoDoFo from a PDF that uses
>>>>> a TrueType font; I have attached a sample document. The font dictionary
>>>>> has no Encoding entry. Here is an excerpt from Adobe's PDF ISO document
>>>>> about this case:
>>>>> *
>>>>> A TrueType font program’s built-in encoding maps directly from
>>>>> character codes to glyph descriptions by means
>>>>> of an internal data structure called a “cmap” (not to be confused with
>>>>> the CMap described in 9.7.5, "CMaps").
>>>>> ...
>>>>> A “cmap” table may contain one or more subtables that represent
>>>>> multiple encodings intended for use on
>>>>> different platforms (such as Mac OS and Windows). Each subtable shall
>>>>> be identified by the two numbers, such
>>>>> as (3, 1), that represent a combination of a platform ID and a
>>>>> platform-specific encoding ID, respectively. *
>>>>> ...
>>>>> *When the font has no Encoding entry, or the font descriptor’s
>>>>> Symbolic flag is set (in which case the Encoding
>>>>> entry is ignored), this shall occur:
>>>>> • If the font contains a (3, 0) subtable, the range of character codes
>>>>> shall be one of these: 0x0000 - 0x00FF,
>>>>> 0xF000 - 0xF0FF, 0xF100 - 0xF1FF, or 0xF200 - 0xF2FF. Depending on the
>>>>> range of codes, each byte
>>>>> from the string shall be prepended with the high byte of the range, to
>>>>> form a two-byte character, which shall
>>>>> be used to select the associated glyph description from the subtable.
>>>>> • Otherwise, if the font contains a (1, 0) subtable, single bytes from
>>>>> the string shall be used to look up the
>>>>> associated glyph descriptions from the subtable.*
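
Inline note: the rule quoted above can be sketched as a small function. This is a minimal illustration, not PoDoFo code. CmapKind and symbolicLookupCode are hypothetical names, and the caller is assumed to know already which subtable the font contains and, for a (3, 0) subtable, the first character code stored in it.

```cpp
#include <cstdint>

// Which cmap subtable the embedded font program offers (hypothetical enum).
enum class CmapKind { Windows30, Mac10 };

// Turns one byte of a PDF string into the code used to query the subtable.
// firstCodeInSubtable matters only for (3, 0): its high byte (0x00, 0xF0,
// 0xF1 or 0xF2) selects the range named in the excerpt and is prepended
// to the byte from the string.
std::uint16_t symbolicLookupCode(CmapKind kind, std::uint8_t byteFromString,
                                 std::uint16_t firstCodeInSubtable)
{
    if (kind == CmapKind::Windows30) {
        const std::uint16_t highByte =
            static_cast<std::uint16_t>(firstCodeInSubtable & 0xFF00u);
        return static_cast<std::uint16_t>(highByte | byteFromString);
    }
    return byteFromString;   // (1, 0): single bytes are looked up directly
}
```

For example, with a (3, 0) subtable whose codes start at 0xF020, the string byte 0x21 (FirstChar=33 in the font dictionary listed further down) would be looked up as 0xF021.
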
>>>>>
>>>>> The PdfFontFactory::CreateFont method does not handle this case, since
>>>>> both a font descriptor and an encoding are required to create a TrueType font.
>>>>> I would like to try doing this myself, but I'm not sure where to start.
>>>>> Obviously I first need to get to the cmap table somehow, but I have no idea
>>>>> how. In the attached PDF, each text block's font dictionary has these
>>>>> entries:
>>>>>
>>>>> BaseFont=KAIXMV+Calibri-Bold
>>>>> FirstChar=33
>>>>> FontDescriptor dictionary
>>>>> LastChar=59
>>>>> Subtype=TrueType
>>>>> ToUnicode dictionary
>>>>> Type=Font
>>>>> Widths array
>>>>>
>>>>> ToUnicode dictionary has these entries:
>>>>> Filter=FlateDecode
>>>>> Length reference
>>>>>
>>>>> The cmap doesn't seem to be there, and the PDF ISO document doesn't provide any
>>>>> useful details. Does anyone have any hints on this?
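
Inline note: the ToUnicode entry listed above is itself a CMap stream that maps character codes to Unicode, so for plain text extraction it may be enough on its own, without the font's "cmap" table. Below is a minimal sketch of reading its bfchar sections. It assumes the stream has already been decompressed (FlateDecode) into text and that tokens are separated by whitespace. It ignores bfrange sections and skips destinations longer than four hex digits. parseBfChar and parseHexToken are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Parses a token of the form "<hex digits>"; returns false for anything else.
static bool parseHexToken(const std::string& tok, unsigned long& value)
{
    if (tok.size() < 3 || tok.front() != '<' || tok.back() != '>') return false;
    const std::string digits = tok.substr(1, tok.size() - 2);
    try {
        std::size_t used = 0;
        value = std::stoul(digits, &used, 16);
        return used == digits.size();
    } catch (...) {
        return false;
    }
}

// Collects "<code> <unicode>" pairs found between beginbfchar and endbfchar.
std::map<std::uint32_t, char32_t> parseBfChar(const std::string& cmapText)
{
    std::map<std::uint32_t, char32_t> table;
    std::istringstream in(cmapText);
    std::string tok;
    bool inside = false;
    while (in >> tok) {
        if (tok == "beginbfchar") { inside = true; continue; }
        if (tok == "endbfchar") { inside = false; continue; }
        if (!inside) continue;
        std::string dst;
        if (!(in >> dst)) break;
        unsigned long code = 0;
        unsigned long uni = 0;
        // dst.size() <= 6 keeps only single UTF-16 code units such as <0041>.
        if (dst.size() <= 6 && parseHexToken(tok, code) && parseHexToken(dst, uni)) {
            table[static_cast<std::uint32_t>(code)] = static_cast<char32_t>(uni);
        }
    }
    return table;
}
```

Each byte of a shown string would then be looked up in the returned table. Codes missing from it need bfrange handling or a fallback to the font's own "cmap" table.
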
>>>>>
>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> Podofo-users mailing list
>>> Podofo-users@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>>
>>>
>