So, I have two questions/suggestions: 1) Wouldn't it be possible to let FOP create the output in two steps, as
(La)TeX does, for instance? It would do a dry run first, only to calculate the page references and store them somewhere, and then produce the actual output
in a second run.
As outlined on this page, this is the approach that we are heading for: http://xml.apache.org/fop/design/optimise.html
This should make documents of any size possible.
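
To make the two-step idea concrete, here is a minimal sketch in Java. It is not FOP code: the class TwoPassSketch, the Block type, and the fixed LINES_PER_PAGE value are all hypothetical, and it assumes that inserting a page number does not change how many lines a block occupies (a real formatter, like LaTeX, may need further runs when it does).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoPassSketch {
    // Hypothetical page capacity; a real layout engine measures areas instead.
    public static final int LINES_PER_PAGE = 40;

    // Hypothetical stand-in for a piece of content; not a FOP class.
    public static class Block {
        final String id;        // name other blocks can refer to
        final int lines;        // how many lines the block occupies
        final String refTarget; // id of the block whose page is cited, or null

        public Block(String id, int lines, String refTarget) {
            this.id = id;
            this.lines = lines;
            this.refTarget = refTarget;
        }
    }

    // Pass 1: dry run. Produces no output, only records the page on which
    // each block starts, so that forward references can be resolved later.
    public static Map<String, Integer> collectPageNumbers(List<Block> blocks) {
        Map<String, Integer> pages = new HashMap<>();
        int usedLines = 0;
        for (Block b : blocks) {
            pages.put(b.id, usedLines / LINES_PER_PAGE + 1);
            usedLines += b.lines;
        }
        return pages;
    }

    // Pass 2: produce the output one block at a time, looking up page
    // references in the table from pass 1 instead of holding the whole
    // document in memory until they are known.
    public static List<String> render(List<Block> blocks, Map<String, Integer> pages) {
        List<String> out = new ArrayList<>();
        int usedLines = 0;
        for (Block b : blocks) {
            int page = usedLines / LINES_PER_PAGE + 1;
            String ref = (b.refTarget == null)
                    ? ""
                    : " (see page " + pages.get(b.refTarget) + ")";
            out.add("p" + page + ": " + b.id + ref);
            usedLines += b.lines;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Block> doc = List.of(
                new Block("intro", 30, "appendix"), // forward reference
                new Block("body", 100, null),
                new Block("appendix", 20, null));
        Map<String, Integer> pages = collectPageNumbers(doc);
        render(doc, pages).forEach(System.out::println);
    }
}
```

The point of the split is that only the small table of page numbers has to survive between the two runs, which is what keeps memory use independent of document size.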
2) Are there plans to port FOP to C/C++ sometime? I guess that at least part of the memory consumption is to be blamed on Java. IIRC, the underlying Xerces and Xalan are already available as C++ versions, so why not FOP?
The issue of implementing FOP is not about the language. Using Java means we already have a large number of services available and reduced debugging effort, so it is a logical choice (this doesn't prevent other choices).
The issue is dealing with the large number of elements, properties and layout issues.
Once the real problem is solved, the choice of language will be a more relevant question.