Sylvain Wallez wrote:
Go back to the first post of this thread, where (last paragraph) I proposed something similar. The whole discussion is about how we could have a syntax which doesn't introduce such verbosity in the sitemap.
Verbosity is not necessarily a bad thing. If it were, would any of us be using XML? ;-)
Good point. However, the only verbosity currently added by views is the "label" attribute. This proposal is about achieving the same low verbosity for views with binary content.
As I explained in several replies, there's no equivalence between a reader and generator able to parse a given binary format. There needs to be some kind of adaptation/extraction before feeding the view.
Yup.
And what you describe above as "a PDF reader, a Word reader, a Postscript reader, etc." are IMO nothing more than _generators_, just like the SWF and MIDI generators we already have.
The functionality for all readers would obviously be the same: move these bytes from here to there. But yes, the codified mapping I think is important.
Please read carefully : I wrote *generators* !! This isn't about moving bytes, but about producing an XML document.
Let's consider the MIDI example. Suppose we have a large collection of karaoke files (MIDI supports embedded text that can be played on screen while playing the music), and we want to index the text of these songs for easy retrieval (along with some other meta-data).
Here's a sitemap example, using the current syntax:
<map:match pattern="*.mid">
  <map:act type="catch-view" src="content">
    <map:generate type="midi" src="{1}.mid"/>
    <map:transform src="xmidi2xdoc.xsl" label="content-label"/>
    <!-- should never come here -->
    <map:serialize type="xml"/>
  </map:match>
  <map:read src="{1}.mid"/>
</map:match>
You're mixing the <map:act> with a </map:match>, but I get the idea.
Picky guy, eh ?
(the "content" view starts at the "content-label" label to clearly distinguish the two notions).
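For reference, here is a sketch of the first example with the nesting corrected (the action closed by </map:act>) and with a declaration of the "content" view added. The view declaration and the sitemap skeleton around it are assumptions based on the usual sitemap layout, not something posted in this thread. "catch-view" is the hypothetical action used in the example above, and the component declarations are left out.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <!-- Component declarations (the "midi" generator, the hypothetical
       "catch-view" action, serializers) are omitted from this sketch. -->

  <map:views>
    <!-- Assumed declaration: the "content" view branches off at the
         "content-label" label and serializes what it receives as XML. -->
    <map:view name="content" from-label="content-label">
      <map:serialize type="xml"/>
    </map:view>
  </map:views>

  <map:pipelines>
    <map:pipeline>
      <map:match pattern="*.mid">
        <!-- Hypothetical action: succeeds only when the "content" view
             is requested, so its children run only in that case. -->
        <map:act type="catch-view" src="content">
          <map:generate type="midi" src="{1}.mid"/>
          <map:transform src="xmidi2xdoc.xsl" label="content-label"/>
          <!-- should never come here: the view exits at the label above -->
          <map:serialize type="xml"/>
        </map:act>
        <!-- No view requested: the action returns null and the raw
             MIDI bytes are sent by the reader. -->
        <map:read src="{1}.mid"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```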
And the proposed shorter one :
<map:match pattern="*.mid">
  <map:read src="{1}.mid" unless-label="content"/>
  <map:generate type="midi" src="{1}.mid"/>
  <map:transform src="xmidi2xdoc.xsl" label="content-label"/>
  <!-- should never come here -->
  <map:serialize type="xml"/>
</map:match>
This breaks the current convention that either a reader or a generator/transformer/serializer can act in a pipeline. In the first example, if "content" isn't specified, the action returns null and the reader is invoked; as far as the pipeline logic is concerned, there is only the reader. Serializers are already known as universal exit points. To use the second, the convention must be broken and readers must become universal exit points.
Readers already are universal exit points : once you encounter a reader, sitemap processing is terminated. <map:read> and <map:serialize> are like a "return" statement in Java.
In other words,
<map:match pattern="*.mid">
  <map:read src="{1}.mid"/> <!-- without the unless-label -->
  <map:generate type="midi" src="{1}.mid"/>
  <map:transform src="xmidi2xdoc.xsl" label="content-label"/>
  <!-- should never come here -->
  <map:serialize type="xml"/>
</map:match>
must become valid for consistency. A reader becomes an exit point and the rest of a pipeline is, by default, ignored. Is this an intended consequence?
No consequence : this is how the sitemap works today, and the above is valid, even if we can consider that the sitemap engine should be more strict and signal that there's some unreachable code.
To add more to the confusion, in both your and my example, we can even avoid writing the <map:serialize> statement. Since some additional filtering occurs beforehand (either through the action or through reader labels), this statement is never reached and is useless !
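A sketch of what the shorter form looks like once the unreachable <map:serialize> is dropped. Note that unless-label is the attribute proposed in this thread, not existing sitemap syntax, and the sketch assumes the "content" view supplies its own serializer.

```xml
<map:pipeline xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:match pattern="*.mid">
    <!-- Proposed attribute (not existing syntax): the reader is skipped
         when the "content" view is requested; otherwise it sends the raw
         bytes and ends sitemap processing, like a "return" statement. -->
    <map:read src="{1}.mid" unless-label="content"/>

    <!-- Reached only when the "content" view is requested. -->
    <map:generate type="midi" src="{1}.mid"/>
    <map:transform src="xmidi2xdoc.xsl" label="content-label"/>
    <!-- No <map:serialize> here: the view branches off at
         "content-label" and is assumed to do its own serialization. -->
  </map:match>
</map:pipeline>
```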
Sylvain
--
Sylvain Wallez    Anyware Technologies
http://www.apache.org/~sylvain    http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance - http://www.orixo.com
