Thanks. The text doesn't display correctly after I copy it from the PDF reader.
A few days ago, I thought that if I got the cidtounicode maps, I could extract
any words. Now I know a little more about it, and the PDF structure is so
difficult to understand.
One more question: how do I write CJK text in .NET with the two extra DLLs?
Can you give me some references?
On 2013-7-31 at 11:08 PM, "iText Info" <i...@1t3xt.info> wrote:
> On 31/07/2013 15:57, Ke Xu wrote:
> > How to use iTextAsian.dll and iTextAsianCmaps.dll when i try to
> > extract CJK text and write CJK text into PDF file?
>
> iTextAsian and iTextAsianCmaps contain metrics files necessary to
> *create* documents containing CJK fonts. You don't need them to extract
> CJK text. You didn't answer Paulo's question yet though: if you open the
> document in Adobe Reader, can you copy/paste the text correctly? If not,
> there's a clear indication that one of the following situations applies:
> (a) the text is present as vector shapes drawn using PDF syntax, NOT
> as real text using a font, or
> (b) the text is drawn using a font, but the font has a strange
> encoding that doesn't allow translation of the characters to Unicode.
> For instance: in a Type3 font, you can pick your characters at random
> and have them point to whatever glyph you want.
> (c) the text is drawn using a font that does allow extraction, but the
> PDF uses obsolete ways to mimic styles, such as: printing the same
> character multiple times to mimic a bold font.
>
> In these cases, there's no way you'll be able to extract the text
> correctly, not using iText, not using any other library.
>
> In these cases, you need to OCR the document!
>
>
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> iText(R) is a registered trademark of 1T3XT BVBA.
> Many questions posted to this list can (and will) be answered with a
> reference to the iText book: http://www.itextpdf.com/book/
> Please check the keywords list before you ask for examples:
> http://itextpdf.com/themes/keywords.php
>