On 29.07.2015 at 03:34, 牛小伟 wrote:
Can you give me the Java code with which you processed it successfully? Thanks very much.

Hello 牛小伟,

I had just processed your file with the ExtractText command-line utility.

But now I have also tried some code, and this works:

PDDocument document = PDDocument.load(new File(pdfFilename), "");
PDFTextStripper stripper = new PDFTextStripper();
stripper.setSortByPosition(true);
System.out.println(stripper.getText(document));


and here's the output I get:

29.07.2015 08:01:32.479 WARN [main] org.apache.pdfbox.pdmodel.font.FileSystemFontProvider:318 - New fonts found, font cache will be re-built
29.07.2015 08:01:32.485 WARN [main] org.apache.pdfbox.pdmodel.font.FileSystemFontProvider:223 - Building font cache, this may take a while
29.07.2015 08:01:33.125 WARN [main] org.apache.pdfbox.pdmodel.font.FileSystemFontProvider:470 - Missing 'name' entry for PostScript name in font C:\Windows\FONTS\Digit.TTF
29.07.2015 08:01:34.488 WARN [main] org.apache.pdfbox.pdmodel.font.FileSystemFontProvider:278 - Finished building font cache, found 404 fonts
29.07.2015 08:01:34.519 WARN [main] org.apache.pdfbox.pdmodel.font.PDCIDFontType0:141 - Using fallback ArialUnicodeMS for CID-keyed font HeiseiKakuGo-W5
現代・起亜自動車、ハイブリッド車の世界販売台数で3位に返り咲き―韓国メディ アItext!



If your code doesn't find the resource UniJIS-UCS2-HW-H, then there's something wrong with your build or your configuration. "UniJIS-UCS2-HW-H" is here:

org\apache\fontbox\cmap\UniJIS-UCS2-HW-H

Open your jar file with WinZip or 7-Zip to look at it.

Btw, that directory has 97 entries.
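
If you would rather check this from code than with an archive tool, here is a minimal sketch using only the JDK's java.util.jar classes. The class name CMapJarCheck and its two helper methods are made up for illustration and are not part of PDFBox or FontBox. It assumes you pass the path of the jar that should contain the CMap resources, and it counts only file entries, so the number may differ from what an archive tool shows if that tool also counts the directory entry.

```java
import java.io.File;
import java.io.IOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of PDFBox/FontBox.
class CMapJarCheck {
    // Jar entries always use forward slashes, also on Windows.
    static final String CMAP_DIR = "org/apache/fontbox/cmap/";

    // Returns true if the jar contains the named CMap resource.
    static boolean hasCMap(File jar, String cmapName) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            return jarFile.getEntry(CMAP_DIR + cmapName) != null;
        }
    }

    // Counts the file (non-directory) entries below the CMap directory.
    static int countCMapEntries(File jar) throws IOException {
        int count = 0;
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith(CMAP_DIR)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CMapJarCheck <path-to-jar>");
            return;
        }
        File jar = new File(args[0]);
        System.out.println("UniJIS-UCS2-HW-H present: " + hasCMap(jar, "UniJIS-UCS2-HW-H"));
        System.out.println("CMap entries: " + countCMapEntries(jar));
    }
}
```

Run it against the jar that your build actually puts on the classpath. If the resource is reported as missing there, the problem is in the packaging rather than in your text extraction code.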

Tilman
