Hello,

I'm a developer working on a project to convert PDF files to SVG, using pdfbox 
and batik.  Pdfbox already contains methods to draw a PDF page onto a Graphics2D 
object, intended for export to JPG, GIF, etc.  However, when I supply an 
SVGGraphics2D object, I observe two problems with the text (probably due to the 
same issue):
- the text is distorted: badly positioned, wrong size, etc.
- certain characters don't appear, for example the euro symbol.  This 
particular character actually causes SVGGraphics2D::drawString to put an 
invalid character (0x02) in the XML.  The string passed to drawString contains 
a single byte, 0x02; however, in the PDF this character is mapped to a Type 1 
font, which (I think) describes how to draw it.
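
To narrow down the second problem, a small check like the one below could be 
run on each string before it reaches drawString.  It is only a diagnostic 
sketch: the class and method names (XmlCharCheck, isValidXmlChar, 
firstInvalidXmlChar) are hypothetical and are not part of pdfbox or batik.  It 
tests each code point against the Char production of XML 1.0, which excludes 
control characters such as 0x02.

```java
// Hypothetical diagnostic helper; not part of pdfbox or batik.
final class XmlCharCheck {
    private XmlCharCheck() {}

    /** Returns true if the code point is allowed by the XML 1.0 Char production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the char index of the first code point not allowed in XML 1.0, or -1 if none. */
    static int firstInvalidXmlChar(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isValidXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

Logging the strings for which firstInvalidXmlChar returns something other than 
-1 would show which text runs carry raw font-specific codes instead of Unicode.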

Example code:
    doc = PDDocument.load(url);
    DOMImplementation domImpl = GenericDOMImplementation.getDOMImplementation();
    Document document = domImpl.createDocument("http://www.w3.org/2000/svg", "svg", null);
    SVGGraphics2D graphics = new SVGGraphics2D(document);
    ....
    PageDrawer.drawPage(graphics, page, pageDimension);
    File outFile = new File("out.svg");
    Writer out = new OutputStreamWriter(new FileOutputStream(outFile), "UTF-8");
    graphics.stream(out);
    out.close();

I realize that the problem seems like it may be with pdfbox; however, the 
output is fine when exporting to, say, JPG (in which case the graphics object 
is a SunGraphics2D).  I looked at the source code but I'm afraid it's a bit 
over my head; the biggest thing I can see is that the algorithms are completely 
different. :)  The output is correct when I manually set 
generatorCtx.svgFont=true, but of course this makes the output file bigger 
(10MB instead of 8MB).

Any help on this issue would be greatly appreciated.  If needed, I can send a 
PDF that reproduces the problem.

Thank you for your time,

Kelsey Rider                                      
