On 09/12/2011 06:57, Craig Ringer wrote:
Hi all


Hi Craig,

With pdf-image, is there any way to coalesce or merge multiple different subsets of the same font into a single font subset with no duplicate glyphs? For example, 50 different "Helvetica (subset)" instances into a single font in the output document?

Background:

I've just got Jeremias's pdf-image extension integrated into my code. It worked perfectly and immediately with little effort, which was delightful. Thank you *VERY* much Jeremias for publishing that; it's a fantastic tool and I'd love to see it in fop core.

I'm encountering an unexpected issue with it, though: the PDFs produced by fop are *huge*. Examination with Acrobat Pro suggests that 90% of the space is taken up by fonts. Looking at the font list, I see huge numbers of copies of "Helvetica (subset)", "Helvetica Black (subset)" etc. That makes sense, since all the input PDFs have fonts embedded, and many use the same fonts. However, I'm including up to 1000 PDFs in each output PDF so the size adds up to prohibitive levels.

We also have the same problem and have been trying to find a solution. There is a cache within the PDF plug-in, but as soon as you change the way it works, memory usage seems to balloon. We did manage to de-duplicate the fonts, though, and we're still investigating the memory issue. If we find a solution, we will let you know.


I'm wondering if there's any way to tell the pdf-image extension to embed certain fonts fully from supplied font files and avoid copying the matching subsets over from the input PDFs. If there isn't anything like that, any idea how practical it'd be?

For that matter, is the idea of collecting up all the subsets of a font as each pdf-image is embedded, then merging them into a single new embedded subset at the end, completely insane? Or is it potentially practical? Even just keeping track of which glyphs are defined in each subset, and building a new subset at the end from a master font file that includes all those glyphs, would help a lot.
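
To make the glyph-tracking idea concrete, here is a rough sketch of the bookkeeping side only. It is not part of pdf-image or FOP: the class SubsetTracker and its methods are hypothetical names made up for illustration. It assumes each glyph can be identified by a stable integer (say, a Unicode code point), which is not guaranteed for subsets with custom encodings. It relies on the PDF convention that a subset font name carries a tag of six uppercase letters and a plus sign, e.g. "ABCDEF+Helvetica".

```java
import java.util.Collection;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: records which glyphs were seen per base font across embedded PDFs. */
class SubsetTracker {
    // Base font name -> union of all glyph identifiers seen so far.
    private final Map<String, SortedSet<Integer>> glyphsByFont = new TreeMap<>();

    /** Strips a subset tag such as "ABCDEF+" (six uppercase letters and a plus sign), if present. */
    static String baseName(String fontName) {
        return fontName.matches("[A-Z]{6}\\+.+") ? fontName.substring(7) : fontName;
    }

    /** Call once per font subset found in an embedded PDF. */
    public void record(String fontName, Collection<Integer> glyphCodes) {
        glyphsByFont.computeIfAbsent(baseName(fontName), k -> new TreeSet<>())
                    .addAll(glyphCodes);
    }

    /** Glyphs a single merged subset of this font would need; empty if the font was never seen. */
    public SortedSet<Integer> glyphsFor(String baseFontName) {
        return glyphsByFont.getOrDefault(baseFontName, new TreeSet<>());
    }
}
```

At the end of the run, the set returned by glyphsFor("Helvetica") would be the input to whatever builds the single new subset from the master font file. That step, and rewriting the font references in the copied content, is the hard part and is not shown here.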

I'm *really* hoping to avoid having to keep on using EPS input and PostScript output to PDF via Distiller, so I'm willing to put some work into this.

An alternative that we are planning on using, if the memory issues with the plug-in can't be solved, is to generate the PDFs from FOP separately from the static PDFs that you are importing, and then use PDFBox in a post-processing step to join the PDFs together at the end. Not ideal, but it works.

Thanks,

Chris


--
Craig Ringer

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscr...@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-h...@xmlgraphics.apache.org




