Hi!

I need to extract text from a PDF, so I am trying to use PoDoFo.

1) I built podofo.dll on 32-bit Windows XP with Visual Studio Express 10.
2) I created a PDF file, "helloworld.pdf", with the example application for
the PoDoFo PDF library.
3) That file looks fine as far as I can see, so I think the DLL was built
correctly.
4) I tried to extract the text from "helloworld.pdf" with the
podofotxtextract tool, but it fails in this function:

void TextExtractor::AddTextElement( double dCurPosX, double dCurPosY,
                                    PdfFont* pCurFont, const PdfString & rString )
{
    if( !pCurFont )
    {
        fprintf( stderr, "WARNING: Found text but do not have a current font: %s\n", rString.GetString() );
        return;
    }

    if( !pCurFont->GetEncoding() )
    {
        fprintf( stderr, "WARNING: Found text but do not have a current encoding: %s\n", rString.GetString() );
        return;
    }

    // For now just write to console
    PdfString unicode = pCurFont->GetEncoding()->ConvertToUnicode( rString, pCurFont );
    const char * pszData = unicode.GetStringUtf8().c_str();
    while( *pszData ) {
        //printf("%02x", static_cast<unsigned char>(*pszData) );
        ++pszData;
    }
    printf("(%.3f,%.3f) %s \n", dCurPosX, dCurPosY, unicode.GetStringUtf8().c_str() );
}

I get an unhandled exception (access violation) because pszData is an
invalid pointer.
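
To show the kind of pointer problem I suspect, here is a minimal sketch that
does not use PoDoFo at all. MakeUtf8Copy is a hypothetical stand-in for a
getter that returns a std::string by value; whether GetStringUtf8() really
behaves like that in my build is only an assumption, and the crash may have
a different cause. The sketch keeps the string in a named local so that the
pointer returned by c_str() stays valid while the loop reads it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a getter that returns std::string BY VALUE.
// Assumption: this may or may not match what PdfString::GetStringUtf8()
// does in a given PoDoFo version and build.
std::string MakeUtf8Copy(const std::string& source)
{
    return source;
}

// Walks the buffer up to the terminating NUL, like the while loop in
// AddTextElement, and returns the number of bytes seen.
std::size_t CountBytes(const char* pszData)
{
    assert(pszData != nullptr);
    std::size_t count = 0;
    while (*pszData) {
        ++count;
        ++pszData;
    }
    return count;
}

// Safe pattern: the named local owns the buffer, so the pointer from
// c_str() remains valid for as long as it is used.
std::size_t CountUtf8BytesSafely(const std::string& source)
{
    const std::string utf8 = MakeUtf8Copy(source);  // owns the buffer
    const char* pszData = utf8.c_str();             // valid while utf8 lives
    return CountBytes(pszData);
}

// Unsafe pattern, shown for comparison only (do not use):
//   const char* p = MakeUtf8Copy(source).c_str();  // temporary destroyed here
//   CountBytes(p);                                 // reads freed memory
```

If the getter returns a reference to a member instead, the original line is
not a lifetime bug, and the problem would have to be somewhere else.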

Can you help me? How can I correctly extract text from a PDF with PoDoFo?

I have attached my DLL and the PDF. Thank you.
_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users