John,

On 19 Mar 2014, at 19:10, John Hewson <j...@jahewson.com> wrote:
> Maruan,
>
>>>> From how I understand the rendering in PDF, Form, Text, Image and Pattern
>>>> maintain their own matrix to map to user space, which is then transformed
>>>> by the CTM to device space, so handling them specifically is fine and
>>>> in line with the spec.
>>>
>>> No, that’s not right, what I said was:
>>>
>>>>> My problem is that tiling patterns are defined in their parent stream’s
>>>>> initial coordinate space, rather than the coordinate space defined by
>>>>> the CTM.
>>>
>>> So patterns should *not* be using the CTM, which is what I’m trying to
>>> achieve.
>>>
>>
>> I think you misunderstood what I wrote - patterns have their own matrix - so
>> I think we are on the same page here. IMHO according to the spec the CTM
>> transforms from user space to device space. So it’s pattern space -> user
>> space -> device space.
>
> Nope, as I said, that’s what PDFBox currently does and it’s wrong. As you say,
> the CTM transforms from user space to device space, but it’s not the only way
> to do so, and it is not used by patterns.

The processing is defined in the spec, which is a good reference, so there is no need to discuss that further. Of course, different people might come to different conclusions when reading and interpreting the spec.

>
>> I didn’t mean to only reference the spec but to use the same terms as
>> described by the spec. Adding references to the spec is an add-on, not a
>> replacement.
>
> I don’t see what value this adds, given that the references will just go
> out-of-date when the next spec is released. We already use the same
> terminology as the PDF spec, so Ctrl+F can be used for quick look-ups that
> won’t go out-of-date.

You are not obliged to add the information.

>
>>> This isn’t possible, as I said it “will necessarily be a breaking change”.
>>> This is because in 2.0 PDFStreamEngine needs to know the parent of each
>>> stream, but processStream and processSubStream do not provide this
>>> information. That’s why I’m discussing this on the mailing list.
>>
>> I don’t understand why this shouldn’t be possible. It’s more effort,
>> agreed, but beneficial.
>
>
> What’s not to understand? PDFStreamEngine *needs* to know the parent of each
> stream, and the old methods don’t provide this. Passing a null parent will
> not work because we need that information later in order to correctly process
> the stream. If we allowed a null parent to be passed, the result would be
> silently broken rendering - there’s no value in providing a
> backwards-compatible API if it can only produce broken results.

We won’t reach the same conclusion here (nor, I think, on the other topics above).

>
> -- John
>
> On 19 Mar 2014, at 10:31, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
>
>> John,
>>
>> On 19 Mar 2014, at 18:15, John Hewson <j...@jahewson.com> wrote:
>>
>>> Maruan
>>>
>>>> From how I understand the rendering in PDF, Form, Text, Image and Pattern
>>>> maintain their own matrix to map to user space, which is then transformed
>>>> by the CTM to device space, so handling them specifically is fine and
>>>> in line with the spec.
>>>
>>> No, that’s not right, what I said was:
>>>
>>>>> My problem is that tiling patterns are defined in their parent stream’s
>>>>> initial coordinate space, rather than the coordinate space defined by
>>>>> the CTM.
>>>
>>> So patterns should *not* be using the CTM, which is what I’m trying to
>>> achieve.
>>>
>>
>> I think you misunderstood what I wrote - patterns have their own matrix - so
>> I think we are on the same page here. IMHO according to the spec the CTM
>> transforms from user space to device space. So it’s pattern space -> user
>> space -> device space.
>>
>>
>>>> I’d suggest that we make sure that the different ‘spaces’ are defined
>>>> properly within the code and refer to the PDF spec so that the code is
>>>> easier to read, if this is not already the case. With so many changes it’s
>>>> a good opportunity to enhance the documentation within the source code.
>>>> Some of the old code enjoys very little documentation.
>>>
>>>
>>> I disagree, in general I don’t think that references to the PDF spec are a
>>> good form of documentation (there are some exceptions). References to the
>>> spec are meaningless to the reader unless they take the time to look them
>>> up in a 700 page PDF document. I would argue that by just linking back to
>>> the spec, we have *failed* to document PDFBox, not succeeded.
>>>
>>> References to the PDF spec have another major flaw: they go out-of-date.
>>> For example a Pattern Colour Space will always be called “Pattern Colour
>>> Space” in future versions of the PDF spec but it may not be described in
>>> paragraph 8.6.6.2 or on page 156. The existing code contains many
>>> references to the PDF 1.6 and 1.7 specs as well as the ISO PDF32000 spec,
>>> which means that I need three 700 page PDF files open at all times in order
>>> to look up PDFBox references. With the new version of the PDF spec due this
>>> year, this situation is going to get worse.
>>>
>>
>> I didn’t mean to only reference the spec but to use the same terms as
>> described by the spec. Adding references to the spec is an add-on, not a
>> replacement.
>>
>>> I agree that some of the existing code needs more documentation, and I
>>> often add documentation to old files which I’m working on. However, my
>>> approach is to just paste in a sentence or two from the PDF spec (fair
>>> use). That way the reader does not ever need to look at the PDF spec.
>>> Because we use the same terminology in PDFBox as in the spec, if someone
>>> really wants to look something up, it’s as simple as Ctrl+F, no reference
>>> needed, and it’s guaranteed not to go out-of-date.
>>>
>>>> I wouldn’t remove processStream and processSubStream but deprecate them
>>>> and remove them in the next major release, so as to keep the changes to
>>>> a minimum.
>>>
>>> This isn’t possible, as I said it “will necessarily be a breaking change”.
>>> This is because in 2.0 PDFStreamEngine needs to know the parent of each
>>> stream, but processStream and processSubStream do not provide this
>>> information. That’s why I’m discussing this on the mailing list.
>>
>> I don’t understand why this shouldn’t be possible. It’s more effort,
>> agreed, but beneficial.
>>
>>>
>>>> For the rendering, what might have been missed is taking the UserUnit entry
>>>> in the page dictionary into account, which might change the default user
>>>> space. This was introduced in PDF 1.6. A good opportunity to read that
>>>> entry and make sure that we handle it appropriately.
>>>
>>> Yes, I have this as a “todo” in my working copy, however, if we put the
>>> UserUnit in the matrix then we should also put the page Rotation into the
>>> matrix, but that’s a significant change.
>>>
>>> -- John
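
To make the pattern-space disagreement above concrete, here is a minimal Java sketch using java.awt.geom.AffineTransform. It is not PDFBox code: the class PatternSpaceSketch, the method patternToDevice and all matrix values are hypothetical and chosen only for illustration. It assumes, as John describes, that a tiling pattern's matrix is combined with the parent stream's initial matrix rather than with the CTM in effect when the pattern is painted.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical illustration only; not part of PDFBox.
final class PatternSpaceSketch {

    // Maps pattern space to device space: the pattern matrix is applied
    // first, then the given base matrix.
    static AffineTransform patternToDevice(AffineTransform base,
                                           AffineTransform patternMatrix) {
        AffineTransform result = new AffineTransform(base);
        result.concatenate(patternMatrix);
        return result;
    }

    public static void main(String[] args) {
        // Assumed initial matrix of the parent content stream (a 2x scale).
        AffineTransform initial = AffineTransform.getScaleInstance(2, 2);

        // The content stream later modifies the CTM, as "1 0 0 1 100 0 cm" would.
        AffineTransform ctm = new AffineTransform(initial);
        ctm.translate(100, 0);

        // Assumed pattern matrix: shift the pattern cell by (10, 10).
        AffineTransform patternMatrix = AffineTransform.getTranslateInstance(10, 10);

        Point2D origin = new Point2D.Double(0, 0);
        Point2D viaInitial = patternToDevice(initial, patternMatrix).transform(origin, null);
        Point2D viaCtm = patternToDevice(ctm, patternMatrix).transform(origin, null);

        // prints "initial matrix: 20.0, 20.0"
        System.out.println("initial matrix: " + viaInitial.getX() + ", " + viaInitial.getY());
        // prints "current CTM:    220.0, 20.0"
        System.out.println("current CTM:    " + viaCtm.getX() + ", " + viaCtm.getY());
    }
}
```

With these made-up values the pattern origin lands at (20, 20) when the parent's initial matrix is used, but at (220, 20) when the current CTM is used, so under the second approach the pattern would move every time the content stream changes the CTM.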