So I've been thinking many Deep Thoughts lately about Pod. I have competing goals in the design of Pod as a document format:
The first and foremost goal is the absolute requirement that Pod be sufficient for easily writing text documentation, and that its semantics be simple enough for all its constructs to be easily translatable into any sane markup language or typesetting system. I think that with the new perlpod and perlpodspec and my forthcoming/in-progress new Pod parser, that is pretty much taken care of. (Then there's just the cleanup work of actually bringing the formatters up to date, notably the currently appalling repulsive oozing hissing pod2html.)

The second goal is that Pod be extensible enough that you could use it as a sort of "Huffman-coding for XML" (as I remember Larry once expressing the idea, altho I'm quoting from memory, as I can't now find the exact message). This idea isn't a real requirement, but hooboy, it'd be nice if I could pull it off. I'm thinking not of all XML, but just XML for document formatting.

I feel like I've got most of the idea so far, but it's all quite tentative and hazy and incomplete. So I'm posting it here in the hopes someone might have a clever idea about it.

The basic idea is that current Pod syntax is a notational variant of a subset of XML, and the job of a Pod parser such as I'm writing is either to turn the Pod syntax into the equivalent XML, or to produce SAX events such as you'd get if you fed that XML into a SAX parser. So this means "B<foo>" turns into either "<B>foo</B>", or into the three events start_element() for "B", characters() for "foo", and end_element() for "B".

So that gets you as far as my first goal, making a parser for just Pod as defined in the specs. (That's without getting into what "=item *" turns into, but that's just a detail.)

However, I want to do the whole extensibility thing. My dream, in its grandest form, is that for whatever XML document format people are dealing with (say, DocBook), no-one will have to key in XML, and people could instead key it in with Pod syntax.
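(For concreteness, here is a rough sketch of the "B<foo>" translation described above, written in Python purely as an illustration; it is not the parser I'm writing. The names pod_events and events_to_xml are made up for this example, and the sketch assumes only single-angle-bracket codes of the form X<...>, with no E<...> escapes and no "<< >>" delimiters.)

```python
from xml.sax.saxutils import escape


def pod_events(text):
    """Turn one Pod paragraph into a list of SAX-like event tuples.

    Hypothetical helper: handles only codes of the form X<...>
    (one ASCII capital letter followed by "<"), including nesting.
    """
    events = []
    open_codes = []  # stack of formatting codes that are currently open
    buf = []         # plain characters not yet emitted

    def flush():
        if buf:
            events.append(("characters", "".join(buf)))
            buf.clear()

    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isascii() and ch.isupper() and text[i + 1:i + 2] == "<":
            flush()
            events.append(("start_element", ch))
            open_codes.append(ch)
            i += 2
        elif ch == ">" and open_codes:
            flush()
            events.append(("end_element", open_codes.pop()))
            i += 1
        else:
            buf.append(ch)
            i += 1
    flush()
    while open_codes:  # close anything left unterminated
        events.append(("end_element", open_codes.pop()))
    return events


def events_to_xml(events):
    """Serialize the event list as the equivalent XML fragment."""
    out = []
    for kind, value in events:
        if kind == "start_element":
            out.append("<%s>" % value)
        elif kind == "end_element":
            out.append("</%s>" % value)
        else:
            out.append(escape(value))
    return "".join(out)


print(pod_events("B<foo>"))
print(events_to_xml(pod_events("a B<b I<c>> d")))
```

Both outputs come from the same event stream, which is the point: a formatter can consume the events directly, or the XML can be handed to whatever XML tools you already have.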
So instead of:

    The <emphasis>destructor</emphasis> (<function>DESTROY</function>) for the object <literal>$b</literal> will be called...

You would have something like:

    =equate M emphasis,B

    =equate U function,C

    =equate T literal,C

and then anytime later...

    The M<destructor> (U<DESTROY>) for the object T<$b> will be called...

The meaning of "=equate M emphasis,B" is "In the rest of this document, I may use a nonstandard formatting code 'M' as shorthand for 'emphasis' (that's an XML element name), but unless the Pod processor has told the Pod parser library that it understands 'emphasis', fall back on using 'B' instead". The end of that list has to be either one of the standard Pod formatting codes (B, C, F, etc.), or one of the specials "0" or "1" -- "0" meaning "ignore it and its content", and "1" meaning "just have its content, with no code around it".

(Incidentally, I'm still not committed to that exact syntax -- among other things, I keep waffling between "=equate" and something like "=extend" or something klunkier like "=defcode".)

That's all fine, and I think it addresses most of the problems that I think are out there. But there are some residual problems, I think:

0) It doesn't allow PIs (<?foo bar?>) or comments (<!-- foo -->). Yes, you can do

    =for whatever xml <?foo bar?>

But you can't do it in the middle of a paragraph. Nor can you do arcana like entity declarations or anything. Hohum, I don't know that this is a big problem.

1) There's no provision for something like this:

    <some-block-level-thing>
      <some-other-block-level-thing>
        <para>I'm a cucumber, I'm a cucumber, I'm a cucumber, I'm a cucumber, Don't you take me to the pickle farm!</para>
      </some-other-block-level-thing>
    </some-block-level-thing>

That is, =equate doesn't allow defining block-scope elements, since all Pod formatting codes (N<...>) occur only within a paragraph (or heading, etc.).
Maybe I could just say that if you want this, just use "=for whateverxml <some-block-level-thing><some-other-block-level-thing>", and then later "=for whateverxml </some-other-block-level-thing></some-block-level-thing>". But if there's some better way, I'd like to hear it.

2) In the example above, how to have paragraph-like things that aren't in a <para>...</para> but are in a <whoozits>...</whoozits>. Maybe this could just be done with something like "=equate Para =whoozits,Para", where Para means the invisible label you get on plain paragraphs? I'm less happy with this.

3) Attributes -- so far I have ways of getting <foo>...</foo> things. But no way of getting a bar="baz" in <foo bar="baz">...</foo>. Maybe I could do some hoodoo like:

    =extend K SuperMagicName,0

where anything aliased to SuperMagicName sets the attribute value(s) for the next element and then disappears, so

    K<class="thang" level="3">B<shazbot>

emits

    <B class="thang" level="3">shazbot</B>

I'd like to find some really convincing way of saying that no, we don't need attributes. Some XML instances really need them, but maybe I get to say "if you need attributes, then you need to write yourself some layer of indirection between your Pod and the output XML, which inserts whatever attributes you need". But on the other hand, that layer of indirection would /possibly/ be something not vastly different from what I'm proposing with the SuperMagicName thing, so my not providing it might be a case of making everyone independently and badly invent the wheel that I'm refusing to provide.

I realize that these are all "problems" not with Pod, but with the attempt to allow the use of Pod as a shorthand for XML. But like I say, if there's some way to kill many birds with one stone (without requiring that stone be the size of Ireland, be in hyperspace, and/or be made of neutronium), it'd be nice to do, so that we could spread around the numminess of Pod!

Thoughts, anyone?

--
Sean M. Burke [EMAIL PROTECTED] http://www.spinn.net/~sburke/