John,

On 19.03.2014 at 18:15, John Hewson <j...@jahewson.com> wrote:

> Maruan
> 
>> From how I understand rendering in PDF, Form, Text, Image and Pattern each 
>> maintain their own matrix to map to user space, which the CTM then 
>> transforms to device space, so handling them specifically is fine and in line 
>> with the spec.
> 
> No, that’s not right, what I said was:
> 
>>> My problem is that tiling patterns are defined in their parent stream’s 
>>> initial coordinate space, rather than the
>>> coordinate space defined by the CTM.
> 
> So patterns should *not* be using the CTM, which is what I’m trying to 
> achieve.
> 

I think you misunderstood what I wrote: patterns have their own matrix, so I 
think we are on the same page here. IMHO, according to the spec, the CTM 
transforms from user space to device space. So it’s pattern space -> user 
space -> device space.
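
To make the chain concrete, here is a minimal sketch using java.awt.geom.AffineTransform. It is not PDFBox code: the class and method names are made up for illustration. Which matrix plays the "user space -> device space" role for a pattern (the current CTM or the parent stream's initial matrix) is exactly the open question in this thread, so the sketch simply takes it as a parameter.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical illustration only; not part of PDFBox.
final class CoordinateSpaces {

    // Maps a point from pattern space to device space:
    // pattern space -> user space -> device space.
    static Point2D patternToDevice(AffineTransform patternMatrix,
                                   AffineTransform userToDevice,
                                   Point2D point) {
        // concatenate(Tx) produces [this] x [Tx], so Tx (the pattern matrix)
        // is applied to the point first and userToDevice second.
        AffineTransform combined = new AffineTransform(userToDevice);
        combined.concatenate(patternMatrix);
        return combined.transform(point, null);
    }

    public static void main(String[] args) {
        // Assumed example values: the pattern is offset by (10, 20) in user
        // space, and user space is scaled by 2 on the way to device space.
        AffineTransform patternMatrix = AffineTransform.getTranslateInstance(10, 20);
        AffineTransform userToDevice = AffineTransform.getScaleInstance(2, 2);

        Point2D device = patternToDevice(patternMatrix, userToDevice,
                                         new Point2D.Double(1, 1));
        // (1, 1) -> (11, 21) in user space -> (22, 42) in device space
        System.out.println(device.getX() + ", " + device.getY()); // 22.0, 42.0
    }
}
```

The order matters: applying the scale before the translation would give (12, 22) instead, which is why it helps to name each space explicitly in the code.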


>> I’d suggest that we make sure that the different ‘spaces’ are defined 
>> properly within the code and refer to the PDF spec so that the code is 
>> easier to read if this is not already the case. With so many changes it’s a 
>> good opportunity to enhance the documentation within the source code. Some 
>> of the old code enjoys very little documentation.
> 
> 
> I disagree, in general I don’t think that references to the PDF spec are a 
> good form of documentation (there are some exceptions). References to the 
> spec are meaningless to the reader unless they take the time to look them up 
> in a 700 page PDF document. I would argue that by just linking back to the 
> spec, we have *failed* to document PDFBox, not succeeded.
> 
> References to the PDF spec have another major flaw: they go out-of-date. For 
> example a Pattern Colour Space will always be called “Pattern Colour Space” 
> in future versions of the PDF spec but it may not be described in paragraph 
> 8.6.6.2 or on page 156. The existing code contains many references to the PDF 
> 1.6 and 1.7 specs as well as the ISO PDF32000 spec, which means that I need 
> three 700 page PDF files open at all times in order to look up PDFBox 
> references. With the new version of the PDF spec due this year, this 
> situation is going to get worse.
> 

I didn’t mean that we should only reference the spec, but that we should use 
the same terms as the spec does. Adding references to the spec is an add-on, 
not a replacement.

> I agree that some of the existing code needs more documentation, and I often 
> add documentation to old files which I’m working on. However, my approach is 
> to just paste in a sentence or two from the PDF spec (fair use). That way the 
> reader does not ever need to look at the PDF spec. Because we use the same 
> terminology in PDFBox as in the spec, if someone really wants to look 
> something up, it’s as simple as Ctrl+F, no reference needed, and it’s 
> guaranteed not to go out-of-date.
> 
>> I wouldn’t remove processStream and processSubStream but deprecate them and 
>> remove them in the next major release though as to keep the changes to a 
>> minimum.
> 
> This isn’t possible, as I said it “will necessarily be a breaking change”. 
> This is because in 2.0 PDFStreamEngine needs to know the parent of each 
> stream, but processStream and processSubStream do not provide this 
> information. That’s why I’m discussing this on the mailing list.

I don’t understand why this shouldn’t be possible. It’s more effort, agreed, 
but beneficial.

> 
>> For the rendering what might have been missed is taking the UserUnit entry 
>> in the page dictionary into account which might change the default user 
>> space. This was introduced in PDF 1.6. A good opportunity to read that entry 
>> and make sure that we handle it appropriately.
> 
> Yes, I have this as a “todo” in my working copy, however, if we put the 
> UserUnit in the matrix then we should also put the page Rotation into the 
> matrix, but that’s a significant change.
> 
> -- John
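
For discussion, a rough sketch of what folding both UserUnit and the page rotation into a single matrix could look like. This is an assumption about one possible approach, not what PDFBox does: the class name, method name and parameters are hypothetical, and crop box offsets and the y-axis flip needed for an actual device are left out. UserUnit is taken as a multiple of 1/72 inch (default 1.0), and Rotate as a clockwise angle in multiples of 90 degrees.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical illustration only; not part of PDFBox.
final class PageMatrixSketch {

    // Builds a matrix that rotates the page clockwise by `rotation` degrees
    // (in a y-up coordinate system), keeps the result in the positive
    // quadrant, and then scales by `userUnit`.
    // `width` and `height` are the unrotated page size in user space units.
    static AffineTransform pageMatrix(double userUnit, int rotation,
                                      double width, double height) {
        AffineTransform rotate;
        switch (((rotation % 360) + 360) % 360) {
            case 0:   // identity
                rotate = new AffineTransform();
                break;
            case 90:  // (x, y) -> (y, width - x)
                rotate = new AffineTransform(0, -1, 1, 0, 0, width);
                break;
            case 180: // (x, y) -> (width - x, height - y)
                rotate = new AffineTransform(-1, 0, 0, -1, width, height);
                break;
            case 270: // (x, y) -> (height - y, x)
                rotate = new AffineTransform(0, 1, -1, 0, height, 0);
                break;
            default:
                throw new IllegalArgumentException(
                        "rotation must be a multiple of 90: " + rotation);
        }
        // scale x rotate: the rotation is applied to a point first,
        // then the UserUnit scale.
        AffineTransform matrix = AffineTransform.getScaleInstance(userUnit, userUnit);
        matrix.concatenate(rotate);
        return matrix;
    }

    public static void main(String[] args) {
        // Assumed example: a 100 x 200 page, UserUnit 2.0, rotated by 90.
        AffineTransform m = pageMatrix(2.0, 90, 100, 200);
        Point2D corner = m.transform(new Point2D.Double(0, 200), null);
        System.out.println(corner.getX() + ", " + corner.getY()); // 400.0, 200.0
    }
}
```

Because the rotation's translation component is scaled together with everything else, the two entries cannot be handled independently once they share a matrix, which is probably part of why this is a significant change.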
