John,

On 19.03.2014 at 18:15, John Hewson <j...@jahewson.com> wrote:
> Maruan
>
>> From how I understand the rendering in PDF Form, Text, Image and Pattern
>> maintain their own matrix to map to user space which is then transformed by
>> the CTM to device space so handling them specifically is fine and in line
>> with the spec.
>
> No, that’s not right, what I said was:
>
>>> My problem is that tiling patterns are defined in their parent stream’s
>>> initial coordinate space, rather than the
>>> coordinate space defined by the CTM.
>
> So patterns should *not* be using the CTM, which is what I’m trying to
> achieve.

I think you misunderstood what I wrote: patterns have their own matrix, so I think we are on the same page here. IMHO, according to the spec, the CTM transforms from user space to device space. So it’s pattern space -> user space -> device space.

>> I’d suggest that we make sure that the different ‚spaces‘ are defined
>> properly within the code and refer to the PDF spec so that the code is
>> easier to read if this is not already the case. With so many changes it’s a
>> good opportunity to enhance the documentation within the source code. Some
>> of the old code enjoys very little documentation.
>
> I disagree, in general I don’t think that references to the PDF spec are a
> good form of documentation (there are some exceptions). References to the
> spec are meaningless to the reader unless they take the time to look them up
> in a 700 page PDF document. I would argue that by just linking back to the
> spec, we have *failed* to document PDFBox, not succeeded.
>
> References to the PDF spec have another major flaw: they go out-of-date. For
> example a Pattern Colour Space will always be called “Pattern Colour Space”
> in future versions of the PDF spec but it may not be described in paragraph
> 8.6.6.2 or on page 156. The existing code contains many references to the PDF
> 1.6 and 1.7 specs as well as the ISO PDF32000 spec, which means that I need
> three 700 page PDF files open at all times in order to look up PDFBox
> references. With the new version of the PDF spec due this year, this
> situation is going to get worse.

I didn’t mean that we should only reference the spec, but that we should use the same terms as the spec. References to the spec are an add-on, not a replacement.

> I agree that some of the existing code needs more documentation, and I often
> add documentation to old files which I’m working on. However, my approach is
> to just paste in a sentence or two from the PDF spec (fair use). That way the
> reader does not ever need to look at the PDF spec. Because we use the same
> terminology in PDFBox as in the spec, if someone really wants to look
> something up, it’s as simple as Ctrl+F, no reference needed, and it’s
> guaranteed not to go out-of-date.
>
>> I wouldn’t remove processStream and processSubStream but deprecate them and
>> remove them in the next major release though as to keep the changes to a
>> minimum.
>
> This isn’t possible, as I said it “will necessarily be a breaking change”.
> This is because in 2.0 PDFStreamEngine needs to know the parent of each
> stream, but processStream and processSubStream do not provide this
> information. That’s why I’m discussing this on the mailing list.

I don’t understand why this shouldn’t be possible. It’s more effort, agreed, but beneficial.

>> For the rendering what might have been missed is taking the UserUnit entry
>> in the page dictionary into account which might change the default user
>> space. This was introduced in PDF 1.6. A good opportunity to read that entry
>> and make sure that we handle it appropriately.
>
> Yes, I have this as a “todo” in my working copy, however, if we put the
> UserUnit in the matrix then we should also put the page Rotation into the
> matrix, but that’s a significant change.
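
For illustration only, here is a rough sketch of what folding both UserUnit and the page rotation into one base matrix could look like. It is not PDFBox code: the class PageBaseMatrixSketch and the method baseMatrix are made-up names, and the sketch assumes the page box has its origin at (0, 0), that Rotate is a clockwise multiple of 90 degrees, and that the result is wanted in 1/72 inch units with the origin at the lower left of the rotated page. The y-axis flip a real renderer needs for device space is left out.

```java
import java.awt.geom.AffineTransform;

final class PageBaseMatrixSketch {

    /**
     * Hypothetical helper (not PDFBox API): maps default user space to an
     * upright page space measured in 1/72 inch units.
     *
     * @param width    page width in user space units (box origin assumed at 0,0)
     * @param height   page height in user space units
     * @param userUnit value of the UserUnit entry (1.0 when absent)
     * @param rotate   value of the Rotate entry, clockwise, multiple of 90
     */
    static AffineTransform baseMatrix(double width, double height,
                                      double userUnit, int rotate) {
        int angle = ((rotate % 360) + 360) % 360;
        AffineTransform rotation;
        switch (angle) {
            case 0:
                rotation = new AffineTransform();
                break;
            case 90:
                // (x, y) -> (y, width - x)
                rotation = AffineTransform.getTranslateInstance(0, width);
                rotation.quadrantRotate(3);
                break;
            case 180:
                // (x, y) -> (width - x, height - y)
                rotation = AffineTransform.getTranslateInstance(width, height);
                rotation.quadrantRotate(2);
                break;
            case 270:
                // (x, y) -> (height - y, x)
                rotation = AffineTransform.getTranslateInstance(height, 0);
                rotation.quadrantRotate(1);
                break;
            default:
                throw new IllegalArgumentException("Rotate must be a multiple of 90");
        }
        // concatenate() applies its argument first: rotate, then scale by UserUnit.
        AffineTransform base = AffineTransform.getScaleInstance(userUnit, userUnit);
        base.concatenate(rotation);
        return base;
    }
}
```

The point of the sketch is only that the two entries end up in the same matrix, so handling one without the other would leave the page in an inconsistent space.
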
> -- John
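
To make the chain pattern space -> user space -> device space concrete, here is a minimal sketch using plain java.awt.geom. The class SpaceChainSketch and the method patternToDevice are hypothetical names, not PDFBox API. Which matrix plays the user-to-device role is exactly the point under discussion: for tiling patterns it would be the parent stream's initial matrix rather than the CTM in effect when the pattern is painted, so the sketch just takes it as a parameter.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

final class SpaceChainSketch {

    /**
     * Hypothetical helper: maps a point from pattern space to device space.
     *
     * @param patternMatrix pattern space -> user space (the pattern's own matrix)
     * @param userToDevice  user space -> device space (for tiling patterns,
     *                      assumed to be the parent stream's initial matrix)
     */
    static Point2D patternToDevice(AffineTransform patternMatrix,
                                   AffineTransform userToDevice,
                                   Point2D point) {
        // concatenate(Tx) gives this(Tx(p)), so the pattern matrix is applied first.
        AffineTransform full = new AffineTransform(userToDevice);
        full.concatenate(patternMatrix);
        return full.transform(point, null);
    }

    public static void main(String[] args) {
        AffineTransform patternMatrix = AffineTransform.getScaleInstance(2, 2);
        AffineTransform userToDevice = AffineTransform.getTranslateInstance(10, 20);
        Point2D device = patternToDevice(patternMatrix, userToDevice,
                                         new Point2D.Double(1, 1));
        System.out.println(device);
    }
}
```

Passing the matrix that happens to be current at paint time instead of the parent's initial one would compile and run just the same, which is why I think naming the spaces explicitly in the code is worth the effort.
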