On 5/6/03 7:59 AM Arnaud Le Hors wrote:

> To end, I'll add a note on the debate over the "infoset-messing
> validation". Here again there is no universal solution.
True.

> If all you are dealing with is XML documents - this is text -, validation
> merely means checking that your document validates against a set of
> constraints. You couldn't care less about any infoset augmentations in an
> environment where everything is text. On the other hand, if what you are
> dealing with is XML data - this is objects serialized in XML -, validation
> is the way you reconcile serialized objects with their real type. Infoset
> augmentations are then fundamental and can hardly be considered as messy...

Wait a second. I agree with this vision, but I wouldn't call it "infoset
augmentation" but rather "infoset normalization".

"Augmenting" means "to add information". That is what I personally consider
bad, because it alters the infoset (read: "information set") of the XML
stream.

Just like DOM Node.normalize() changes the tree but doesn't change the
information included in the tree, your 'normalization' of types changes the
tree but doesn't change the information it includes.

Why am I so picky about this? Consider caching: when the infoset is
normalized, the information contained is not changed, just morphed. So, if
an XML stream wasn't valid before validation, it's not valid after; but if
it was valid before validation, it is valid after. This means: normalizing
the infoset doesn't influence the ergodic period of that infoset.

But if I have "infoset augmentation" (say, external entity evaluation), this
cannot be considered true anymore. This is why DTDs are and must be
considered harmful in a heavily cache-based environment like Cocoon. Yes,
there are ways to work around this, but none are as elegant as separating
concerns between infoset-normalizing pipeline stages and infoset-augmenting
pipeline stages.

> So, beware of over-simplistic characterizations... :-)

I hope this helps outline my point on this.

-- Stefano.
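
P.S. to make the Node.normalize() analogy a bit more concrete, here is a
minimal, illustrative sketch using the standard JAXP/DOM API (the class name
NormalizeDemo is made up for the example). It builds an element with two
adjacent text nodes, calls normalize(), and shows that the tree shape
changes while the information it carries does not:

  import javax.xml.parsers.DocumentBuilder;
  import javax.xml.parsers.DocumentBuilderFactory;
  import org.w3c.dom.Document;
  import org.w3c.dom.Element;

  public class NormalizeDemo {
      public static void main(String[] args) throws Exception {
          DocumentBuilder builder =
              DocumentBuilderFactory.newInstance().newDocumentBuilder();
          Document doc = builder.newDocument();
          Element root = doc.createElement("root");
          doc.appendChild(root);

          // two adjacent text nodes: same information, "un-normalized" tree
          root.appendChild(doc.createTextNode("Hello, "));
          root.appendChild(doc.createTextNode("world"));

          System.out.println(root.getChildNodes().getLength()); // 2
          System.out.println(root.getTextContent());            // Hello, world

          // normalize() merges adjacent text nodes: the tree changes...
          root.normalize();

          System.out.println(root.getChildNodes().getLength()); // 1
          System.out.println(root.getTextContent());            // ...the information doesn't
      }
  }

The same holds for type "normalization" during validation: the tree gets
morphed, but no information is added, so whatever cache validity the stream
had before is still meaningful afterwards.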
