Shigeru Okada created PDFBOX-4934:
-------------------------------------
Summary: Could not find referenced cmap stream Adobe-Japan1-XXXX
Key: PDFBOX-4934
URL: https://issues.apache.org/jira/browse/PDFBOX-4934
Project: PDFBox
Issue Type: Bug
Components: FontBox
Affects Versions: 2.0.20
Environment: Windows10, 64bit
Reporter: Shigeru Okada
Attachments: JP.pdf, Korea.pdf
An IOException occurs when the attached PDF is fed into PDFBox.
The attached PDF file (JP.pdf) references the Adobe-Japan1-65534 CMap.
The source code is below.
---
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
import org.apache.commons.io.FileUtils;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

public class pdfBoxTest {
    public static void main(String[] args) throws Exception {
        pdfBoxTest sample = new pdfBoxTest();
        String pdfname = "D:/tmp/jp.pdf";
        File pdf = FileUtils.getFile(pdfname);
        // extractTextFromPDF(File) is not included in this excerpt.
        sample.extractTextFromPDF(pdf);
        sample.load(pdf);
    }

    // Renders the first page at 300 DPI and writes it out as a JPEG.
    public void load(File pdf) throws Exception {
        try (PDDocument document = PDDocument.load(pdf)) {
            PDFRenderer renderer = new PDFRenderer(document);
            BufferedImage bufImage = renderer.renderImageWithDPI(0, 300,
                    ImageType.RGB);
            ImageIO.write(bufImage, "jpg", new File("D:/tmp/jp.jpg"));
        }
    }
}
The getExternalCMap method in the CMapParser class tries to find the external
CMap, but it cannot find Adobe-Japan1-65534 and throws an exception.
I know that no such CMap exists, but the PDF file itself opens without any
problem in other viewers, so I think it would be better to fall back to another
CMap instead of throwing an exception.
As a temporary workaround I modified the source code as below. It works well.
protected InputStream getExternalCMap(String name) throws IOException {
    InputStream is = this.getClass().getResourceAsStream(name);
    if (is == null) {
        if (name.startsWith("Adobe-Japan1")) {
            name = "Adobe-Japan1-1";
        } else if (name.startsWith("Adobe-Korea1")) {
            name = "Adobe-Korea1-1";
        }
        is = this.getClass().getResourceAsStream(name);
        if (is == null) {
            throw new IOException(
                    "Error: Could not find referenced cmap stream " + name);
        }
    }
    return is;
}
However, this is not a fundamental fix.
If possible, I would like to ask you to modify the source code so that it does
not throw an exception when it cannot find the CMap.
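A more general version of the workaround above might look like the sketch below. It is only an illustration, not PDFBox code: the class name, the resolve method, the MAX_SUPPLEMENT cap and the "exists" predicate are all hypothetical, and the predicate stands in for the getResourceAsStream lookup. Unlike my workaround, it falls back to the highest lower supplement that is available, on the assumption that a higher supplement of the same character collection contains everything in the lower ones.

```java
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

public class CMapFallbackSketch {

    // Hypothetical upper bound for the fallback search, so that a bogus
    // supplement such as 65534 does not cause tens of thousands of lookups.
    private static final int MAX_SUPPLEMENT = 9;

    /**
     * Returns the requested name if it exists; otherwise the same
     * "Registry-Ordering" prefix with the highest lower supplement that
     * exists; otherwise an empty Optional. It never throws for a missing CMap.
     */
    public static Optional<String> resolve(String requested, Predicate<String> exists) {
        if (exists.test(requested)) {
            return Optional.of(requested);
        }
        int dash = requested.lastIndexOf('-');
        if (dash < 0) {
            return Optional.empty();
        }
        String prefix = requested.substring(0, dash);
        int supplement;
        try {
            supplement = Integer.parseInt(requested.substring(dash + 1));
        } catch (NumberFormatException e) {
            // Not a "Registry-Ordering-Supplement" name, e.g. "Identity-H".
            return Optional.empty();
        }
        for (int s = Math.min(supplement - 1, MAX_SUPPLEMENT); s >= 0; s--) {
            String candidate = prefix + "-" + s;
            if (exists.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the CMap resources that are actually on the classpath.
        Set<String> available = Set.of("Adobe-Japan1-5", "Adobe-Japan1-6", "Adobe-Korea1-2");
        System.out.println(resolve("Adobe-Japan1-65534", available::contains).orElse("none"));
        System.out.println(resolve("Adobe-Korea1-3", available::contains).orElse("none"));
    }
}
```

The caller could then log a warning and continue with the resolved name, or skip the external CMap when nothing is found.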
I also found a Korean PDF file; it references the Adobe-Korea1-3 CMap.
<<
/Supplement 3
/Registry (Adobe)
/Ordering (Korea1)
>>
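As far as I can tell, the CMap name in the error message is made by joining these three entries with hyphens. The small sketch below shows that assumption; the class and method names are only for illustration and are not PDFBox API.

```java
public class CMapNameSketch {

    // Joins Registry, Ordering and Supplement into a "Registry-Ordering-Supplement" name.
    public static String cmapName(String registry, String ordering, int supplement) {
        return registry + "-" + ordering + "-" + supplement;
    }

    public static void main(String[] args) {
        // Values taken from the dictionary above.
        System.out.println(cmapName("Adobe", "Korea1", 3)); // prints Adobe-Korea1-3
    }
}
```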
Please refer to attached file.
Thanks!
//Okada