Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Tim Waugh

On Thu, Oct 10, 2002 at 05:43:38PM -0400, Daniel Veillard wrote:

> > (1) You add support for <?if?> and friends to xsltproc.  Probably the
> > fastest route to a complete solution.
> >
> > (2) You tell me you'll take a patch from me to implement them.  I'd
> > have to learn the xsltproc code, so it would take longer, but I
> > can do that.
>
>   Honestly 1/ and 2/ are not acceptable. Now if someone decides
> to standardize something like <?if?> then it's a big mess.
> Moreover if this can be done by a small and fast external preprocessing,
> why try to put everything in the same tool ?

Eric has written a tool, xmlif, to do this, and it's available in
xmlto-0.0.11pre6.  What's a 'big mess', incidentally?  Do 'if' et al
need to be namespaced, you mean?

Tim.


Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Daniel Veillard

On Fri, Oct 11, 2002 at 08:52:01AM +0100, Tim Waugh wrote:
> On Thu, Oct 10, 2002 at 05:43:38PM -0400, Daniel Veillard wrote:
>
> > > (1) You add support for <?if?> and friends to xsltproc.  Probably the
> > > fastest route to a complete solution.
> > >
> > > (2) You tell me you'll take a patch from me to implement them.  I'd
> > > have to learn the xsltproc code, so it would take longer, but I
> > > can do that.
> >
> >   Honestly 1/ and 2/ are not acceptable. Now if someone decides
> > to standardize something like <?if?> then it's a big mess.
> > Moreover if this can be done by a small and fast external preprocessing,
> > why try to put everything in the same tool ?
>
> Eric has written a tool, xmlif, to do this, and it's available in
> xmlto-0.0.11pre6.  What's a 'big mess', incidentally?  Do 'if' et al
> need to be namespaced, you mean?

  Well something like this, basically I believe in standardization, don't
blame me on this but I would dislike if people were starting to rely on
xsltproc specific behaviour, if that behaviour isn't properly documented
and has a sufficiently large user base. Been there, even when conceptually
something looks like a very simple tool, it's only when you go through 
the steps of a large review that you discover the potentially fatal flaws
that can be associated with it. As a programmer I prefer to write and maintain
code associated to processing which went at least through the first steps
equivalent to a standardization track, coding is slow, maintenance is a
pain, life is short, enough reasons to be relatively careful about what
I decide to implement and maintain.

  Now to be relatively specific about <?if?> as much as I can since I
don't have any clear picture of how the selection is actually done, it seems
to be in the line of the previously found standard extension abuses
like #pragma foobar for Winblows C compilers or various custom PIs
that each SGML toolchain seems to have developed to tie in their
customers in the 90's (I may get some heat for this, I don't care ;-)

  I would really prefer to get DocBook fixed to allow proper conditionalization
at the *markup* level (if the current solution is not sufficient for
users' needs like Eric), which then will be close to trivial to handle 
correctly in the existing XSLT tools, independently of the toolchain used.

  In a nutshell, my gut feeling is that PIs + custom preprocessing
are the kind of patchy mechanism which would be acceptable if the
environment was a locked proprietary toolchain but don't feel right
in a DocBook framework where most of the processing is done otherwise
with standard tools.

  Now if a number of people did voice in saying that's the kind of processing
they really need, that there is a clean and public description with
review of the suggested extension, then I would certainly be an early
implementor of said feature.

Makes sense ?

Daniel

-- 
Daniel Veillard  | Red Hat Network https://rhn.redhat.com/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Tim Waugh

On Fri, Oct 11, 2002 at 04:22:50AM -0400, Daniel Veillard wrote:

>   Now if a number of people did voice in saying that's the kind of processing
> they really need, that there is a clean and public description with
> review of the suggested extension, then I would certainly be an early
> implementor of said feature.
>
> Makes sense ?

Yes, it does.

I don't know what Eric's complaint with the existing <![%cond;[..]]>
mechanism is.  Eric?

Tim.


Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Daniel Veillard

On Fri, Oct 11, 2002 at 04:41:41AM -0400, Eric S. Raymond wrote:
> Daniel Veillard [EMAIL PROTECTED]:
> >   Now to be relatively specific about <?if?> as much as I can since I
> > don't have any clear picture of how the selection is actually done, it seems
> > to be in the line of the previously found standard extension abuses
> > like #pragma foobar for Winblows C compilers or various custom PIs
> > that each SGML toolchain seems to have developed to tie in their
> > customers in the 90's (I may get some heat for this, I don't care ;-)
>
> You know, there's reason people keep re-inventing mechanisms for this.
> It's because they need to get work done -- and getting work done often
> means wanting to conditionalize documents without spending days on some
> elaborate custom XSLT hack.

  But then you put the burden on someone else. And an underspecified,
underreviewed mechanism is the hell of the maintainer's

> > they really need, that there is a clean and public description with
> > review of the suggested extension, then I would certainly be an early
> > implementor of said feature.
>
> Attribute/value pairs from the command line are matched
> against the attributes associated with certain processing
> instructions in the document. The instructions are <?if?> and
> its inverse <?if not?>, <?elif?> and its inverse <?elif not?>,
> <?else?>, and <?fi?>.
>
> Argument/value pairs given on the command line are checked
> against the value of corresponding attributes in the
> conditional processing instructions. An `attribute match'
> happens if an attribute occurs in both the command-line
> arguments and the tag, and the values match. An `attribute
> mismatch' happens if an attribute occurs in both the
> command-line arguments and the tag, but the values do not
> match.
>
> Spans between <?if?> or <?elif?> and the next conditional
> processing instruction at the same nesting level are passed
> through unaltered if there is at least one attribute match and
> no attribute mismatch; spans between <?if not?> and <?elif
> not?> and the next conditional processing instruction are
> passed otherwise. Spans between <?else?> and the next
> conditional-processing tag are passed through only if no
> previous span at the same level has been passed through.
> <?if?> and <?fi?> (and their `not' variants) change the
> current nesting level; <?else?> and <?elif?> do not.
>
> All these processing instructions will be removed from the
> output produced. Aside from the conditionalization, all other
> input is passed through untouched; in particular, entity
> references are not resolved.
>
> Value matching is by string equality, except that | in an
> attribute value is interpreted as an alternation character.
> Thus, saying foo='red|blue' on the command line enables
> conditions red and blue. Saying color='black|white' in a tag
> matches command-line conditions color='black' and
> color='white'.
>
> Here is an example:
>
>     Always issue this text.
>     <?if condition='html'?>
>     Issue this text if 'condition=html' is given on the command line.
>     <?elif condition='pdf|ps'?>
>     Issue this text if 'condition=pdf' or 'condition=ps'
>     is given on the command line.
>     <?else?>
>     Otherwise issue this text.
>     <?fi?>
>     Always issue this text.

  Doesn't cover a hell of issues, 2mn read just pops up tons of unspecified
behaviour or serious problems. Heck, even the condition= syntax is only
given in the example

- well formedness breakage, your description is done at the
  serialization level, it has 0 guarantee on the level of XML well-formedness

  <?if condition='html'?>
  <foo>
  <?elif condition='pdf|ps'?>
  </foo>
  <?fi?>

  What gives ??? A further XML well formedness error ? In that case it's
  better left external. Otherwise you'd have to start to give a description
  in terms of the infoset, or similar like XInclude does.

- malformed preprocessor commands

  <?if?>
  <?elif cond='pdf
  |foo'?>
  unmatched <?else?> or <?fi?>, etc, etc ...

  what is handled, and how ?

Daniel

-- 
Daniel Veillard  | Red Hat Network https://rhn.redhat.com/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond

Daniel Veillard [EMAIL PROTECTED]:
>   You want to do
> XML -> process X -> xmlsubset -> transform -> web or print
>
> process X standalone can't be done easily with XSLT, yes.
>
> XML -> process X + transform -> web or print
>
> can be done with XSLT assuming the way you tag things integrates okay
> with the markup, which is *not* the case with PIs

I don't *want* to do anything but have a painless way to conditionalize
my documents.  I'm not attached to doing it with a preprocessor.
But without the right kind of markup support built into the XSLT engine,
that seems to be the least painful choice.

If I weren't fully aware of the problems with this approach, I would
not have raised the possibility that these PIs might belong at the
XSLT level.

> And a solution based on markup tags/attributes and not PIs is likely
> to be quite simpler to describe fully.

Yes, I have been thinking about this.  One alternative would be the
interface Jirka Kosek's stylesheet supports -- in effect, any tag may
have a condition attribute; the tag and its contents disappear if that
attribute's value doesn't match a passed parameter.  This would have
the advantage that the conditionalized output is guaranteed to be
well-formed if the input was.

>   Yes with respect to structure, error handling etc ... And get traction,
> checking that others are interested in it and have reviewed it.

The PI-based approach will get a real-world test from xmlto's users after
the 0.11 release.

>   You claim to have the magic solution, I want to hear
> more voices before committing on it, especially since I'm not personally
> convinced it's the right technical approach.

Magic, no.  Workable, yes.
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond

Daniel Veillard [EMAIL PROTECTED]:
>   I'm not convinced that one needs access to the DocType to conditionalize
> *processing* . And I'm definitely convinced that it's useless to try to
> add support for an unstructured processing within an XML toolkit.

The problem with not being able to see the doctype is that that means
you cannot pass it through transparently.  XSLT cannot reproduce on its
output what it can't see on its input.  That's why I gave up on XSLT -- 
because *any* stylesheet-based approach to conditionalization will have
the same problem.  

xmlif may be inelegant from a pure XML point of view, but it is
certainly not useless.  I'm using it quite effectively every time I 
render my document -- in fact it gives me important capabilities I
would not otherwise have.  Perhaps you should reconsider your
definition of useless?

>   I don't want to add an ad-hoc unspecified unstructured processing
> to my toolkit and maintain it. Show me a more coherent request and
> I will re-evaluate it. Mind you I have work to get done too !

I don't know what you would consider a coherent request.  Do you want
a more formal specification of the behavior of the PIs?
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond

Daniel Veillard [EMAIL PROTECTED]:
>   Now to be relatively specific about <?if?> as much as I can since I
> don't have any clear picture of how the selection is actually done, it seems
> to be in the line of the previously found standard extension abuses
> like #pragma foobar for Winblows C compilers or various custom PIs
> that each SGML toolchain seems to have developed to tie in their
> customers in the 90's (I may get some heat for this, I don't care ;-)

You know, there's reason people keep re-inventing mechanisms for this.
It's because they need to get work done -- and getting work done often 
means wanting to conditionalize documents without spending days on some
elaborate custom XSLT hack.

> I would really prefer to get DocBook fixed to allow proper conditionalization
> at the *markup* level (if the current solution is not sufficient for
> users' needs like Eric), which then will be close to trivial to handle
> correctly in the existing XSLT tools, independently of the toolchain used.

Norm?  Are you listening?  Is this anything that the DocBook development
group is thinking about?

>   Now if a number of people did voice in saying that's the kind of processing
> they really need, that there is a clean and public description with
> review of the suggested extension, then I would certainly be an early
> implementor of said feature.

   Attribute/value pairs from the command line are matched
   against the attributes associated with certain processing
   instructions in the document. The instructions are <?if?> and
   its inverse <?if not?>, <?elif?> and its inverse <?elif not?>,
   <?else?>, and <?fi?>.

   Argument/value pairs given on the command line are checked
   against the value of corresponding attributes in the
   conditional processing instructions. An `attribute match'
   happens if an attribute occurs in both the command-line
   arguments and the tag, and the values match. An `attribute
   mismatch' happens if an attribute occurs in both the
   command-line arguments and the tag, but the values do not
   match.

   Spans between <?if?> or <?elif?> and the next conditional
   processing instruction at the same nesting level are passed
   through unaltered if there is at least one attribute match and
   no attribute mismatch; spans between <?if not?> and <?elif
   not?> and the next conditional processing instruction are
   passed otherwise. Spans between <?else?> and the next
   conditional-processing tag are passed through only if no
   previous span at the same level has been passed through.
   <?if?> and <?fi?> (and their `not' variants) change the
   current nesting level; <?else?> and <?elif?> do not.

   All these processing instructions will be removed from the
   output produced. Aside from the conditionalization, all other
   input is passed through untouched; in particular, entity
   references are not resolved.

   Value matching is by string equality, except that | in an
   attribute value is interpreted as an alternation character.
   Thus, saying foo='red|blue' on the command line enables
   conditions red and blue. Saying color='black|white' in a tag
   matches command-line conditions color='black' and
   color='white'.

   Here is an example:

       Always issue this text.
       <?if condition='html'?>
       Issue this text if 'condition=html' is given on the command line.
       <?elif condition='pdf|ps'?>
       Issue this text if 'condition=pdf' or 'condition=ps'
       is given on the command line.
       <?else?>
       Otherwise issue this text.
       <?fi?>
       Always issue this text.
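
For readers who want to experiment with the rules described above, here is a
minimal Python sketch of that behaviour. It is not xmlif's source code, only
an illustration under stated assumptions: the processing instructions are
written as <?if attr='value'?> ... <?fi?>; an <?elif?> or <?else?> span is
passed only if no earlier span at the same level was passed; and newlines
from suppressed spans are kept so that line numbers stay stable (a behaviour
mentioned elsewhere in this thread). The names xmlif(), _selected() and
EXAMPLE are hypothetical, and attribute values containing "?" are not handled.

```python
import re

# Matches <?if ...?>, <?elif ...?>, <?else?> and <?fi?>.
PI = re.compile(r"<\?(if|elif|else|fi)\b([^?]*)\?>")
# Matches name='value' or name="value" inside a PI.
ATTR = re.compile(r"""([\w.-]+)\s*=\s*(['"])(.*?)\2""")


def _selected(rest, settings):
    """Apply the match/mismatch rule to the body of an if/elif PI."""
    negate = rest.split()[:1] == ["not"]
    matches = mismatches = 0
    for name, _quote, value in ATTR.findall(rest):
        if name not in settings:
            continue  # attribute not given on the command line: ignored
        # "|" is treated as an alternation character on either side.
        if set(settings[name].split("|")) & set(value.split("|")):
            matches += 1
        else:
            mismatches += 1
    passed = matches > 0 and mismatches == 0
    return not passed if negate else passed


def xmlif(text, settings):
    """Return text with conditional spans resolved against settings."""
    out = []
    stack = []  # one [parent_live, branch_taken, live] frame per open <?if?>
    pos = 0

    def emit(chunk):
        live = stack[-1][2] if stack else True
        # Suppressed spans keep their newlines so line numbers stay stable.
        out.append(chunk if live else "\n" * chunk.count("\n"))

    for m in PI.finditer(text):
        emit(text[pos:m.start()])
        out.append("\n" * m.group(0).count("\n"))  # the PI itself is dropped
        pos = m.end()
        kind, rest = m.group(1), m.group(2)
        if kind == "if":
            parent = stack[-1][2] if stack else True
            live = parent and _selected(rest, settings)
            stack.append([parent, live, live])
            continue
        if not stack:
            raise ValueError(f"<?{kind}?> without a matching <?if?>")
        frame = stack[-1]
        if kind == "fi":
            stack.pop()
        elif kind == "elif":
            frame[2] = frame[0] and not frame[1] and _selected(rest, settings)
            frame[1] = frame[1] or frame[2]
        else:  # kind == "else"
            frame[2] = frame[0] and not frame[1]
            frame[1] = True
    if stack:
        raise ValueError("unterminated <?if?>")
    emit(text[pos:])
    return "".join(out)


EXAMPLE = """Always issue this text.
<?if condition='html'?>
Issue this text if 'condition=html' is given on the command line.
<?elif condition='pdf|ps'?>
Issue this text if 'condition=pdf' or 'condition=ps'
is given on the command line.
<?else?>
Otherwise issue this text.
<?fi?>
Always issue this text.
"""
```

Calling xmlif(EXAMPLE, {"condition": "html"}) keeps the first conditional span
and blanks the other two. Because the sketch works on the character stream,
it inherits the limitation discussed in this thread: it gives no guarantee
that the output is well-formed XML.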

-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond

Daniel Veillard [EMAIL PROTECTED]:
>   Probably 1.0.22 usually within one month.

Thanks.
 
>   Honestly 1/ and 2/ are not acceptable. Now if someone decides
> to standardize something like <?if?> then it's a big mess.
> Moreover if this can be done by a small and fast external preprocessing,
> why try to put everything in the same tool ?

Because the external tool can't chase conditionals in entity inclusions.
The conditionals would have to be interpreted at parsing time
for that to work.
 
> > (3) You add --error-filename.  Has the advantage that it could be used
> > with other preprocessors.
>
>   Basically the problem is that the processor working on a processed
> file gives back the error in terms of the processed file filename and
> line number instead of the original one. I can hardly see how it could
> be otherwise, the line number will be wrong anyway and even attempting
> to provide the initial filename is only a partial solution due to the
> entities support problem.

In previous mail, I wrote: Half the problem is already solved.  There
was a mis-bug in sgmlpre that it passes through newlines from sections
that are conditionalized out.  Thus the line numbers are correct.
Perhaps I didn't make clear that xmlpre inherited this feature.

What would you consider a complete solution to this problem?  I'm not
wedded to xmlif itself, I just need to get some work done that
requires being able to conditionalize stuff.  If you think there's a
better way to handle this, I'm open to it.

So far, Jirka's stylesheet solution fails because of deficiencies in XSLT 1.0,
and xmlif can't chase inclusions.  It sure looks to me like a complete
solution will have to be built into xsltproc.

>   Who else is gonna use this switch ? I could try to hack this but this
> sounds partial solution and not of widespread use, right ?

Any other preprocessor would need it.

>   Can't you just sed the output in case of errors ?
> s/^processed/original/
> of the stderr stream doesn't sound hard to setup ...

Fine if I'm processing errors in batch mode, yes.  But I mentioned Emacs
for a reason...
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond

Daniel Veillard [EMAIL PROTECTED]:
> On Fri, Oct 11, 2002 at 05:38:02AM -0400, Eric S. Raymond wrote:
> > xmlif knows nothing about the XML structure of the document.  All it `sees'
> > is the processing instructions in what is otherwise, from its point of view,
> > a featureless byte stream.
>
>   Then there is no good reason to implement it in an XML toolkit, really.
> Using sed sounds the right tool for the job.

You seem to be missing the point, here.

I don't want to write ad-hoc sed scripts every time I want to conditionalize
a document, any more than I want to write a single-use XSLT hack every time I
want to conditionalize a document.  Such approaches could only appeal to 
someone who is in love with XSLT or sed.  I'm not.

I want a simple tool that I can re-use.  Without having to edit my 
documents every time I want to generate a different variant.  Without
constantly writing custom stylesheets.  I have work to get done!

I tried the politically correct XML-purist approach.  It sucked.
There are weaknesses in XSLT 1.0 that make it unsuitable for this job
-- specifically the fact that a stylesheet can't see the input
doctype.

So I've written a tool that throws away that whole level of structure and
gets the job done.  I'd sure like to develop a better solution, but you 
seem to be intent on denying there is a problem.
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond

Daniel Veillard [EMAIL PROTECTED]:
> > You know, there's reason people keep re-inventing mechanisms for this.
> > It's because they need to get work done -- and getting work done often
> > means wanting to conditionalize documents without spending days on some
> > elaborate custom XSLT hack.
>
>   But then you put the burden on someone else. And an underspecified,
> underreviewed mechanism is the hell of the maintainer's

Do we build tools to be worshiped or to be used?  It's our *job* to accept
that `burden', Daniel.

The conditionalization mechanisms now available are complex, weak, and
painful to use.  This needs to be fixed.  I have made a start at addressing
the problem.  I don't pretend to have a final solution, but at least I'm
doing something more effective than chanting the sacred litanies of XML 
theology and how XSLT will save us all, hallelujah!  

It would be nice if somebody else would notice that this is a real-world 
problem that affects real users.  I bumped my nose on it because I'm working
with a real document, the Jargon File.  You might have heard of it.

>   Doesn't cover a hell of issues, 2mn read just pops up tons of unspecified
> behaviour or serious problems. Heck, even the condition= syntax is only
> given in the example

condition could be any attribute.  The tool doesn't care.
 
> - well formedness breakage, your description is done at the
>   serialization level, it has 0 guarantee on the level of XML well-formedness

That's right.

>   <?if condition='html'?>
>   <foo>
>   <?elif condition='pdf|ps'?>
>   </foo>
>   <?fi?>
>
>   What gives ??? A further XML well formedness error ? In that case it's
>   better left external. Otherwise you'd have to start to give a description
>   in terms of the infoset, or similar like XInclude does.
>
> - malformed preprocessor commands
>   <?if?>
>   <?elif cond='pdf
>   |foo'?>
>   unmatched <?else?> or <?fi?>, etc, etc ...

If you do this, you lose :-).

>   what is handled, and how ?

xmlif knows nothing about the XML structure of the document.  All it `sees'
is the processing instructions in what is otherwise, from its point of view,
a featureless byte stream.
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond


Tim Waugh [EMAIL PROTECTED]:
> I don't know what Eric's complaint with the existing <![%cond;[..]]>
> mechanism is.  Eric?

Huh?  I thought that feature was SGML only!

I looked in my nutshell XML book.  It says, indeed, that conditional
sections are not allowed in the internal subset.  I think I see a sort
of painful sideways way to get this effect by conditional inclusion of
entities, but -- bletch!  Barf!  It's ugly, ugly, ugly.  And involves
modifying the document itself every time you want to change the
conditionalization, which is unacceptable.
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>




DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond

Daniel Veillard [EMAIL PROTECTED]:
> On Thu, Oct 10, 2002 at 06:14:59PM -0400, Eric S. Raymond wrote:
> > What would you consider a complete solution to this problem?  I'm not
> > wedded to xmlif itself, I just need to get some work done that
> > requires being able to conditionalize stuff.  If you think there's a
> > better way to handle this, I'm open to it.
>
>   XSLT. Your basic problem is to format to print and web version.

I already tried this route, using Jirka Kosek's profiling stylesheet.
It was because that was such a serious pain in the butt that I started
tool-building.

"Let's do everything in XSLT" makes a nice theory but it falls down
hard in practice.  The first serious stylesheet hack I tried ran 
smack bang into an XSLT limit that shouldn't have been there.

>   I don't use Emacs, but I don't see why you couldn't make it execute
> a wrapper shell around the XSLT processor instead of running the processor
> directly.

xmlto provides such a wrapper shell.  I'm digging into my references on
shell to see if there is a way to redirect the standard error of a
process without stepping on stdout.
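
One way to treat the two streams separately, sketched here in Python rather
than shell, is to capture stdout and stderr on separate pipes and rewrite
only the error lines. This is a sketch under assumptions, not part of xmlto
or xsltproc: the function name run_with_renamed_errors() and the file names
in the usage note are hypothetical, and error lines are assumed to begin
with the name of the preprocessed file.

```python
import subprocess


def run_with_renamed_errors(cmd, processed, original):
    """Run cmd; return (exit status, stdout, stderr with the name swapped)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          text=True)
    fixed = []
    for line in proc.stderr.splitlines(keepends=True):
        if line.startswith(processed):
            # Same effect as sed 's/^processed/original/', on stderr only.
            line = original + line[len(processed):]
        fixed.append(line)
    return proc.returncode, proc.stdout, "".join(fixed)
```

A caller would pass the processor's command line as a list, for example
["xsltproc", "style.xsl", "tmp.xml"] with "tmp.xml" and "doc.xml" as the
processed and original names, then write the two returned strings to its own
stdout and stderr. The sketch buffers all output until the child exits, so it
suits batch use better than an interactive session.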
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Dave Pawson

At 11:33 11/10/2002, Daniel Veillard wrote:

>   I don't affirm or deny anything w.r.t. conditionalization needs. I'm
> just stating my position as the guy who implements and maintains the
> friggin' code, okay !

Corr, he's a bad tempered old b isn't he :-)



>   I'm not convinced that one needs access to the DocType to conditionalize
> *processing* .
>   I don't want to add an ad-hoc unspecified unstructured processing
> to my toolkit and maintain it.

I think you're right btw Daniel.
  Specials? Bah humbug I think is the atypical reply.

Regards DaveP.





DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

/ Daniel Veillard [EMAIL PROTECTED] was heard to say:
|   I would really prefer to get DocBook fixed to allow proper conditionalization
| at the *markup* level (if the current solution is not sufficient for
| users' needs like Eric), which then will be close to trivial to handle 
| correctly in the existing XSLT tools, independently of the toolchain used.

Ahem. I think DocBook has all the markup necessary. I'll save my
summary of the issues until I've read the whole thread, but DocBook
ain't the problem here.

Be seeing you,
  norm

-- 
Norman Walsh [EMAIL PROTECTED]  | How can there be laughter, how can
http://www.oasis-open.org/docbook/ | there be pleasure, when the world
Chair, DocBook Technical Committee | is burning?--The Dhammapada
   | (probably 3rd century BC)



DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

This is a hard problem. If it was an easy problem, we wouldn't have to
keep reinventing solutions for it.

Right up front I think you have to choose: are you going to process
XML or are you going to process a character stream. Both are useful
and both have historically been employed. The conditional section
feature of SGML is really the character stream approach and profiling
attributes are the XML approach.

If you choose a character stream, I think, you're really not
processing XML. There's no reason to say you have an XML tool and I
have to support Daniel's reluctance to incorporate it into libxml.

If you're going to go the XML route, there are definitely some
artifacts of XSLT processing that are inconvenient.

What I don't understand off the top of my head Eric, is why you
abandoned the XML approach when you abandoned XSLT.

I'd be much more interested in your tool if it was a WF XML processor
that conditionalized XML elements based on their attributes. In other words,
instead of processing:

| Preamble
| <!-- this is the test on the manual page -->
| Always issue this text.
| <?if condition='html'?>
| Issue this text if 'condition=html' is given on the command line.
| <?elif condition='pdf|ps'?>
| Issue this text if 'condition=pdf' or 'condition=ps'
| is given on the command line.
| <?else?>
| Otherwise issue this text.
| <?fi?>
| Always issue this text.

Why not process:

<doc>
<title>Preamble</title>
<!-- this is the test on the manual page -->
<para>Always issue this text.
<phrase condition="html">Issue this text if 'condition=html' is given on the
command line.</phrase>
<phrase condition="pdf|ps">Issue this text if 'condition=pdf' or 'condition=ps'
is given on the command line.</phrase>
<phrase condition="somethingelse">Otherwise issue this text.</phrase>
</para>
<para>Always issue this text.</para>
</doc>

It's harder to write the else cases in this style, but I think a
little creativity in the syntax of the condition attributes might
alleviate some of those problems.
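
As an illustration of the attribute-based approach, here is a minimal Python
sketch that parses well-formed XML and drops every element whose condition
attribute is present but does not list the requested value. It is a sketch
under assumptions, not an existing tool: the names profile(),
profile_string() and SAMPLE are hypothetical, "|" is used as the separator
only because the example above writes condition="pdf|ps", and the root
element itself is never dropped.

```python
import xml.etree.ElementTree as ET


def profile(elem, attr, wanted):
    """Drop every descendant whose attr is set but does not list wanted."""
    previous = None
    for child in list(elem):
        value = child.get(attr)
        if value is not None and wanted not in value.split("|"):
            # Keep the text that followed the dropped element.
            tail = child.tail or ""
            if previous is None:
                elem.text = (elem.text or "") + tail
            else:
                previous.tail = (previous.tail or "") + tail
            elem.remove(child)
        else:
            profile(child, attr, wanted)
            previous = child


def profile_string(xml_text, attr, wanted):
    """Parse xml_text, profile it, and serialize the result."""
    root = ET.fromstring(xml_text)
    profile(root, attr, wanted)
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<doc>
<title>Preamble</title>
<para>Always issue this text.
<phrase condition="html">Issue this text for html.</phrase>
<phrase condition="pdf|ps">Issue this text for pdf or ps.</phrase>
</para>
<para>Always issue this text.</para>
</doc>"""
```

Calling profile_string(SAMPLE, "condition", "ps") keeps the second phrase and
drops the first, and the result is well-formed because whole elements are
removed. Note that ElementTree's default parser does not carry comments or
the DOCTYPE declaration through to the output, which is the same kind of
limitation raised against stylesheet-based profiling in this thread.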

Be seeing you,
  norm

-- 
Norman Walsh [EMAIL PROTECTED]  | The perfect man has no method; or
http://www.oasis-open.org/docbook/ | rather the best of methods, which
Chair, DocBook Technical Committee | is the method of
   | no-method.--Shih-T'ao



DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

/ Eric S. Raymond [EMAIL PROTECTED] was heard to say:
| > It's harder to write the else cases in this style, but I think a
| > little creativity in the syntax of the condition attributes might
| > alleviate some of those problems.
|
| Propose a syntax?  All the ones I thought up were too ugly to live.  If
| you can come up with anything better I might implement it.

I think I'm willing to live without else. If I want else, I think the
right answer is a special-purpose XML vocabulary:

  <chapter>
    <prof:choose>
      <prof:when condition="html">
        <title>HTML Title</title>
      </prof:when>
      <prof:when condition="print">
        <title>Print Title</title>
      </prof:when>
      <prof:otherwise>
        <title>Print and HTML Title</title>
      </prof:otherwise>
    </prof:choose>


Where the profiling application always removes all prof: elements.
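
A profiling pass over such a vocabulary could look roughly like the Python
sketch below. Everything specific in it is an assumption made for
illustration: the namespace URI bound to the prof prefix is invented (no URI
is given in this thread), the function names are hypothetical, "|" is
accepted as an alternation character in condition values, and text placed
directly inside prof: elements is ignored.

```python
import xml.etree.ElementTree as ET

PROF = "urn:example:prof"  # hypothetical namespace URI for the prof: prefix
CHOOSE, WHEN, OTHERWISE = (f"{{{PROF}}}{n}" for n in ("choose", "when", "otherwise"))


def pick_branch(choose, wanted):
    """Return the first matching prof:when, else prof:otherwise, else None."""
    for branch in choose:
        if branch.tag == WHEN and wanted in branch.get("condition", "").split("|"):
            return branch
        if branch.tag == OTHERWISE:
            return branch
    return None


def resolve(elem, wanted):
    """Replace each prof:choose under elem by the content of its chosen branch."""
    kept = []
    for child in list(elem):
        if child.tag == CHOOSE:
            branch = pick_branch(child, wanted)
            if branch is not None:
                resolve(branch, wanted)  # branches may nest further chooses
                kept.extend(list(branch))
        else:
            resolve(child, wanted)
            kept.append(child)
    elem[:] = kept  # all prof: elements are gone after this assignment


def apply_choose(xml_text, wanted):
    """Parse xml_text and return the root element after profiling."""
    root = ET.fromstring(xml_text)
    resolve(root, wanted)
    return root


SOURCE = f"""<chapter xmlns:prof="{PROF}">
<prof:choose>
<prof:when condition="html"><title>HTML Title</title></prof:when>
<prof:when condition="print"><title>Print Title</title></prof:when>
<prof:otherwise><title>Print and HTML Title</title></prof:otherwise>
</prof:choose>
<para>Body text.</para>
</chapter>"""
```

With this input, apply_choose(SOURCE, "html") leaves a chapter whose only
title is the HTML one, and any value other than html or print falls through
to the prof:otherwise branch.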

Be seeing you,
  norm

-- 
Norman Walsh [EMAIL PROTECTED]  | Time wounds all heels.
http://www.oasis-open.org/docbook/ | 
Chair, DocBook Technical Committee |



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Daniel Veillard
On Fri, Oct 11, 2002 at 05:38:02AM -0400, Eric S. Raymond wrote:
> xmlif knows nothing about the XML structure of the document.  All it `sees'
> is the processing instructions in what is otherwise, from its point of view,
> a featureless byte stream.

  Then there is no good reason to implement it in an XML toolkit, really.
Using sed sounds the right tool for the job.

Daniel

-- 
Daniel Veillard  | Red Hat Network https://rhn.redhat.com/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Daniel Veillard
On Fri, Oct 11, 2002 at 06:36:21AM -0400, Eric S. Raymond wrote:
> Daniel Veillard [EMAIL PROTECTED]:
> >   I'm not convinced that one needs access to the DocType to conditionalize
> > *processing* . And I'm definitely convinced that it's useless to try to
> > add support for an unstructured processing within an XML toolkit.
>
> The problem with not being able to see the doctype is that that means
> you cannot pass it through transparently.  XSLT cannot reproduce on its
> output what it can't see on its input.  That's why I gave up on XSLT --
> because *any* stylesheet-based approach to conditionalization will have
> the same problem.

  You want to do
XML -> process X -> xmlsubset -> transform -> web or print

process X standalone can't be done easily with XSLT, yes.

XML -> process X + transform -> web or print

can be done with XSLT assuming the way you tag things integrates okay
with the markup, which is *not* the case with PIs

> xmlif may be inelegant from a pure XML point of view, but it is
> certainly not useless.  I'm using it quite effectively every time I
> render my document -- in fact it gives me important capabilities I
> would not otherwise have.  Perhaps you should reconsider your
> definition of useless?

  useless to try to add support for an unstructured processing
   within an XML toolkit

 yes I stand on this, not the proper place. You must understand that
at that level I have an in-memory tree and that your <?if?> <?else?>
are just PIs scattered around the tree and if they are not properly
instantiated w.r.t the structure the whole processing falls over.
And a solution based on markup tags/attributes and not PIs is likely
to be quite simpler to describe fully.

I don't want to add an ad hoc, unspecified, unstructured processing
  to my toolkit and maintain it. Show me a more coherent request and
  I will re-evaluate it. Mind you I have work to get done too !
 
 I don't know what you would consider a coherent request.  Do you want
 a more formal specification of the behavior of the PIs?

  Yes, with respect to structure, error handling, etc. And get traction:
check that others are interested in it and have reviewed it.
This thread between you and me could go on for ages, which is not the point.
I don't want to implement something which might be architecturally broken
or too limited to be widely useful. You may have 100 MBytes of markup
tagged this way, but that won't be the reason why this should be added
at the toolkit level. You claim to have the magic solution; I want to hear
more voices before committing to it, especially since I'm not personally
convinced it's the right technical approach.

Daniel




Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Daniel Veillard
On Fri, Oct 11, 2002 at 06:09:37AM -0400, Eric S. Raymond wrote:
 So I've written a tool that throws away that whole level of structure and
 gets the job done.  I'd sure like to develop a better solution, but you 
 seem to be intent on denying there is a problem.

  I don't affirm or deny anything w.r.t. conditionalization needs. I'm
just stating my position as the guy who implements and maintains the 
friggin' code, okay !
  I'm not convinced that one needs access to the DocType to conditionalize
*processing* . And I'm definitely convinced that it's useless to try to
add support for an unstructured processing within an XML toolkit.
  I don't want to add an ad hoc, unspecified, unstructured processing
to my toolkit and maintain it. Show me a more coherent request and
I will re-evaluate it. Mind you I have work to get done too !

Daniel




DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

/ Norman Walsh [EMAIL PROTECTED] was heard to say:
| 5. You know, I really want this at the URI level.
|
|    <!DOCTYPE book PUBLIC "..." "...">
|    <book>
|      ...
|      <xi:include 
|        href="http://localhost/profile/path/to/document.xml?condition='html'"/>
|    </book>
|
| Now we're getting somewhere!

Yep. I hacked this together as a CGI script on my local web server.
You could navigate the QE II through that security hole, but it's
useful enough to demonstrate that it's what I really want. Now, with
WebDAV, I wonder...

Be seeing you,
  norm

-- 
Norman Walsh [EMAIL PROTECTED]  | The belief in a supernatural
http://www.oasis-open.org/docbook/ | source of evil is not necessary;
Chair, DocBook Technical Committee | men alone are quite capable of
   | every wickedness.--Joseph Conrad



DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond
Norman Walsh [EMAIL PROTECTED]:
 1. Entities should be expanded. If users process
 
    <!DOCTYPE book PUBLIC "..." "..." [
    <!ENTITY chap1.xml SYSTEM "chap1.xml">
    ]>
    <book>
      ...
      &chap1;
    </book>
 
    They're going to expect the profiling to apply to the content of
    &chap1;, not just the wrapper script. That means this code needs to be
    implemented as an XML process, not a character-stream process. And really,
    I think it needs to be done at the parser level, not in something like flex.
    Though I suppose you could implement a specialty XML parser with flex.

Right.  I know this.  This is why I suggested that the facility might belong in
the XSLT engine itself.
 
 2. The downside of expanding entities is that &nbsp; is going to become &#160;.
    Is this really a problem? You're not going to edit the profiled content,
    right?

Yes, it is really a problem.  Precisely because you cannot predict in advance
what kind of postprocessing the user will want to do.

For any filter to throw away semantic information that it doesn't
absolutely have to is bad design.  If your tools force you to do this,
your tools are broken.  XSLT is a broken tool for this purpose.

 3. OTOH, I really do want this to happen before validation. That way I can
    write
 
    <chapter>
      <title condition="print">Print Title</title>
      <title condition="online">Online Title</title>
 
    and have the right thing happen.

I do exactly this.  

<title>
<?if condition='book'?>
The New Hacker's Dictionary
<?else?>The
Jargon File
<?fi?> 
(version &version;)
</title>

This is one of the reasons I'm skeptical about a pure-XML approach.  If you're
going to have to process before validation, why care about the XML 
structure at all?
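
A character-stream conditionalizer of the kind described here can be sketched in a few lines of Python. This is not xmlif itself (which the thread says is a flex program); the function name `apply_conditionals` and the exact PI grammar accepted by the regular expression are assumptions for illustration, based only on the `if`/`else`/`fi` forms quoted in this thread.

```python
import re

# The three PI forms quoted in the thread. Accepting only a single
# condition='...' pseudo-attribute is an assumption of this sketch.
PI = re.compile(
    r"<\?(if)\s+condition=['\"]([^'\"]*)['\"]\s*\?>|<\?(else)\?>|<\?(fi)\?>"
)


def apply_conditionals(text, active):
    """Keep or drop text between <?if?>/<?else?>/<?fi?> PIs.

    `active` is the set of condition names that are switched on.
    Everything else is copied through untouched as plain characters.
    """
    out = []
    stack = []  # one boolean per open <?if?>: is its current branch emitted?
    pos = 0
    for m in PI.finditer(text):
        if all(stack):  # all([]) is True, so top-level text is always kept
            out.append(text[pos:m.start()])
        pos = m.end()
        if m.group(1):  # <?if condition='...'?>
            stack.append(m.group(2) in active)
        elif m.group(3):  # <?else?>
            if not stack:
                raise ValueError("<?else?> without <?if?>")
            stack[-1] = not stack[-1]
        else:  # <?fi?>
            if not stack:
                raise ValueError("<?fi?> without <?if?>")
            stack.pop()
    if stack:
        raise ValueError("unclosed <?if?>")
    out.append(text[pos:])
    return "".join(out)
```

Because the text is never parsed as XML, the DOCTYPE, the internal subset and entity references such as `&version;` pass through unchanged. By the same token nothing checks that the surviving text is still well-formed.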
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>





DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

/ Eric S. Raymond [EMAIL PROTECTED] was heard to say:
| Norman Walsh [EMAIL PROTECTED]:
| I think the right answer is a specialized XML parser that performs a
| variant of the identity transformation. In fact, it does exactly what
| Jirka's profiling code does except that it has a funky serializer that
| outputs the !DOCTYPE declaration and the internal subset (or ideally
| only the necessary parts of it).
| 
| In fact, that's just what my code does, by way of an egregious hack.
|
| So, are you planning to publish and support this?

Unclear. I think I'll investigate mod_perl and see if I can tighten up
some of the security holes. I'm not planning to do anything with this
real soon. And it would be nice to have a solution that didn't require
a web server.

Daniel, is there any sort of hook in xsltproc that would allow this to
be grafted on? Maybe a catalog entry that sends the URI to a bit of
python for resolution? Hmm... :-)

Be seeing you,
  norm

-- 
Norman Walsh [EMAIL PROTECTED]  | It is not failure of others to
http://www.oasis-open.org/docbook/ | appreciate your abilities that
Chair, DocBook Technical Committee | should trouble you, but rather
   | your failure to appreciate
   | theirs.--Confucius



DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond
Norman Walsh [EMAIL PROTECTED]:
 I think the right answer is a specialized XML parser that performs a
 variant of the identity transformation. In fact, it does exactly what
 Jirka's profiling code does except that it has a funky serializer that
 outputs the !DOCTYPE declaration and the internal subset (or ideally
 only the necessary parts of it).
 
 In fact, that's just what my code does, by way of an egregious hack.

So, are you planning to publish and support this?
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>





DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

/ Eric S. Raymond [EMAIL PROTECTED] was heard to say:
| Norman Walsh [EMAIL PROTECTED]:
| 1. Entities should be expanded. If users process
[...]
| Right.  I know this.  This is why I suggested that the facility might belong in
| the XSLT engine itself.

Ah, but now you're mixing two processes together that are in principle
distinct. I like the idea of the profiling processor, but I don't
think it belongs in XSLT, it's more general than that.

| 2. The downside of expanding entities is that &nbsp; is going to become &#160;.
|    Is this really a problem? You're not going to edit the profiled content,
|    right?
|
| Yes, it is really a problem.  Precisely because you cannot predict in advance
| what kind of postprocessing the user will want to do.
|
| For any filter to throw away semantic information that it doesn't
| absolutely have to is bad design.  If your tools force you to do this,
| your tools are broken.  XSLT is a broken tool for this purpose.

No, the culprit here is XML. Entities are defined in the surface
syntax. It's really hard to parse XML without expanding entities. (In
fact, it's impossible to do a validating parse without expanding
them).

The problem is that entities are used for four different purposes that
we'd like to be able to distinguish:

1. For including external parsed content.

   <!ENTITY foo.xml SYSTEM "foo.xml">

2. For including internal parsed content (macro expansion)

   <!ENTITY bar "Bar">

3. For referencing characters we can't type on our keyboards

   <!ENTITY nbsp "&#160;">

4. For referencing external unparsed content

   <!ENTITY foo.png SYSTEM "foo.png" NDATA PNG>

Unfortunately, 1-3 are all referenced the same way in content. (I
included 4 just for completeness; it isn't relevant to what we're
discussing.)

There's little point in blaming XSLT for expanding &nbsp; when we're
asking it to expand &foo.xml; and &bar;. (And because entities are a
syntactic device, it would be very hard to define a generally useful
XML data model for XSLT that left entities unexpanded.)
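
The effect is easy to observe with any parser-based tool, not only XSLT. The following is a small sketch using Python's standard `xml.etree.ElementTree`; the sample document and its entity names are made up for illustration. Both the macro-style entity and the character entity are expanded during parsing, and neither the references nor the DOCTYPE survive serialization.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with an internal subset declaring a
# macro-style entity and a character entity.
SOURCE = """<!DOCTYPE doc [
<!ENTITY bar "Bar">
<!ENTITY nbsp "&#160;">
]>
<doc>&bar;&nbsp;text</doc>"""

root = ET.fromstring(SOURCE)

# The parser has already replaced both references with their values.
print(repr(root.text))

# Serializing (US-ASCII by default) writes the no-break space as a numeric
# character reference; the DOCTYPE and entity names are gone.
serialized = ET.tostring(root)
print(serialized)
```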

Be seeing you,
  norm

-- 
Norman Walsh [EMAIL PROTECTED]  | On the other hand, you have
http://www.oasis-open.org/docbook/ | different fingers.
Chair, DocBook Technical Committee |



DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

/ Eric S. Raymond [EMAIL PROTECTED] was heard to say:
| Norman Walsh [EMAIL PROTECTED]:
| I think I'm willing to live without else. If I want else, I think the
| right answer is a special-purpose XML vocabulary:
| 
|   <chapter>
|     <prof:choose>
|       <prof:when condition="html">
|         <title>HTML Title</title>
|       </prof:when>
|       <prof:when condition="print">
|         <title>Print Title</title>
|       </prof:when>
|       <prof:otherwise>
|         <title>Print and HTML Title</title>
|       </prof:otherwise>
|     </prof:choose>
| 
| 
| Where the profiling application always removes all prof: elements.
|
| Now tell me how this differs, other than in surface syntax, from what I've
| already done?

Your model allows

<chapter>
  <title>
  <?if condition="html"?>HTML Title
  </title>
  <?fi?>
  

That's valid when the PIs are left in, but results in a non-XML
document when profiled. My model forces the input to be well-formed
XML and guarantees that the result will be well-formed.

Generally speaking, PI milestones are a very fragile markup mechanism.

For the complex (and hopefully rare) if-then-else case, your markup is
less verbose. But I think the common case is just

  <?if condition="html"?>
  <para>...</para>
  <?fi?>

where PIs, fragile as they are, really don't provide any overriding benefit.
Far better, IMHO, to say

  <para condition="html">...</para>
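
As a rough illustration of why the attribute form is robust, here is a sketch of attribute-based profiling on a parsed tree, using Python's standard `xml.etree.ElementTree`. The function name `profile` and the `|`-separated value convention are assumptions of this sketch (the separator is borrowed from the `condition="pdf|ps"` example elsewhere in the thread). Since whole elements are kept or dropped, the result is well-formed whenever the input was.

```python
import xml.etree.ElementTree as ET


def profile(elem, active, attr="condition"):
    """Remove descendants whose `attr` names none of the active conditions.

    Elements without the attribute are always kept. Note that removing an
    element in ElementTree also drops the text that follows it (its tail);
    a real tool would have to decide what to do with that text.
    """
    for child in list(elem):  # copy: we mutate elem while iterating
        value = child.get(attr)
        if value is not None and not (set(value.split("|")) & active):
            elem.remove(child)
        else:
            profile(child, active, attr)
    return elem
```

For example, profiling a chapter with `{"html"}` active keeps the `html` paragraph and any unconditional ones, and drops the `print` paragraph.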

Be seeing you,
  norm

-- 
Norman Walsh [EMAIL PROTECTED]  | Clearness is so eminently one of
http://www.oasis-open.org/docbook/ | the characteristics of truth that
Chair, DocBook Technical Committee | often it even passes for truth
   | itself.--Joubert



Re: DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Bob Stayton
On Fri, Oct 11, 2002 at 03:09:05PM -0400, Norman Walsh wrote:
 
 Some more thoughts about this issue...
 
 1. Entities should be expanded. If users process
 
    <!DOCTYPE book PUBLIC "..." "..." [
    <!ENTITY chap1.xml SYSTEM "chap1.xml">
    ]>
    <book>
      ...
      &chap1;
    </book>
 
    They're going to expect the profiling to apply to the content of
    &chap1;, not just the wrapper script. That means this code needs to be
    implemented as an XML process, not a character-stream process. And really,
    I think it needs to be done at the parser level, not in something like flex.
    Though I suppose you could implement a specialty XML parser with flex.
 
 2. The downside of expanding entities is that &nbsp; is going to become &#160;.
    Is this really a problem? You're not going to edit the profiled content,
    right?
 
 3. OTOH, I really do want this to happen before validation. That way I can write
 
    <chapter>
      <title condition="print">Print Title</title>
      <title condition="online">Online Title</title>
 
    and have the right thing happen.

On the third hand, you can't load such documents into
a validating editor.  You'd have to wait until you process
it with conditions resolved to find out if it is valid.  Or
you write a smart validating editor that understands your
conditional syntax, and lets you set the conditions that
apply for a given session.  Text outside the conditions
could be dimmed and excluded from validation.  Now *that*
is the way to write conditional documents.

What happened to the old method of using phrase:
<chapter>
  <title><phrase condition="print">Print Title</phrase>
         <phrase condition="online">Online Title</phrase></title>

which can be validated?
 
 4. That means that losing the !DOCTYPE declaration is unfortunate.
But that could be fixed, I think, with a specialty XML parser.

I'm not clear what this means.
 
 5. You know, I really want this at the URI level.
 
    <!DOCTYPE book PUBLIC "..." "...">
    <book>
      ...
      <xi:include 
        href="http://localhost/profile/path/to/document.xml?condition='html'"/>
    </book>
 
 Now we're getting somewhere!

Would that be part of XPointer or XInclude?

-- 

Bob Stayton 400 Encinal Street
Publications Architect  Santa Cruz, CA  95060
Technical Publications  voice: (831) 427-7796
Caldera International, Inc. fax:   (831) 429-1887
email: [EMAIL PROTECTED]



DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond


Norman Walsh [EMAIL PROTECTED]:
 I think I'm willing to live without else. If I want else, I think the
 right answer is a special-purpose XML vocabulary:
 
   <chapter>
     <prof:choose>
       <prof:when condition="html">
         <title>HTML Title</title>
       </prof:when>
       <prof:when condition="print">
         <title>Print Title</title>
       </prof:when>
       <prof:otherwise>
         <title>Print and HTML Title</title>
       </prof:otherwise>
     </prof:choose>
 
 Where the profiling application always removes all prof: elements.

Now tell me how this differs, other than in surface syntax, from what I've
already done?

Yes, I could have dressed up my conditionals this way.  Still could with 
some trivial changes to my flex program.  I thought it would be more
honest to make them PIs.
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>




DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Eric S. Raymond


Norman Walsh [EMAIL PROTECTED]:
 What I don't understand off the top of my head Eric, is why you
 abandoned the XML approach when you abandoned XSLT.

Well...I could cite the 'else' problem you mention below, but the truth is
that when it became apparent that XSLT wouldn't cut it, I dusted off sgmlpre
because that seemed like the quickest way to get a working tool.  Took me
less than a day.
 
 Why not process:
 
 <doc>
 <title>Preamble</title>
 <!-- this is the test on the manual page -->
 <para>Always issue this text.
 <phrase condition="html">Issue this text if 'condition=html' is given on the
 command line.</phrase>
 <phrase condition="pdf|ps">Issue this text if 'condition=pdf' or 'condition=ps'
 is given on the command line.</phrase>
 <phrase condition="somethingelse">Otherwise issue this text.</phrase>
 </para>
 <para>Always issue this text.</para>
 </doc>

I've thought about this, actually.  Not so hard to implement with flex.
 
 It's harder to write the else cases in this style, but I think a
 little creativity in the syntax of the condition attributes might
 alleviate some of those problems.

Propose a syntax?  All the ones I thought up were too ugly to live.  If
you can come up with anything better I might implement it.
-- 
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>




DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

/ Eric S. Raymond [EMAIL PROTECTED] was heard to say:
| Norman Walsh [EMAIL PROTECTED]:
| That's valid when the PIs are left in, but results in a non-XML
| document when profiled. My model forces the input to be well-formed
| XML and guarantees that the result will be well-formed.
|
| Good answer.  Same one I anticipated several messages up-thread :-)
|
| So, how would *you* implement this specialized vocabulary?  XSLT
| doesn't have the marbles for it.

I think the right answer is a specialized XML parser that performs a
variant of the identity transformation. In fact, it does exactly what
Jirka's profiling code does except that it has a funky serializer that
outputs the !DOCTYPE declaration and the internal subset (or ideally
only the necessary parts of it).

In fact, that's just what my code does, by way of an egregious hack.

#!/usr/bin/perl -w -- # -*- Perl -*-

use strict;
use XML::Parser::PerlSAX;

my $host = $ENV{'HTTP_HOST'} || "";
my $uri = $ENV{'REQUEST_URI'} || "";

if ($host ne 'localhost') {
    forbidden();
}

my %profile = ();
my $xmlfile = "";
my $options = "";

if ($uri =~ /^.*?profile(\/.*?)\?(.*)$/) {
    $xmlfile = $1;
    $options = $2;
} elsif ($uri =~ /^.*?profile(\/.*)$/) {
    $xmlfile = $1;
} else {
    forbidden();
}

my @args = split(/&/, $options);
foreach $_ (@args) {
    tr/+/ /;
    s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
}

foreach my $cond (@args) {
    next if $cond !~ /^(\S+)=(\S+)/;
    if (exists $profile{$1}) {
        $profile{$1} .= "|$2";
    } else {
        $profile{$1} = $2;
    }
}

print "Content-type: application/xml\n\n";

my $xmlDecl = "<?xml version='1.0'?>";
my $internalSubset = "";

if (open (F, $xmlfile)) {
    # THIS IS A HACK
    read (F, $_, 16384);

    if (/^\s*(\<\?xml.*?\?>)/is) {
        $xmlDecl = $1;
    }

    if (/<!DOCTYPE\s/is) {
        $_ = $& . $'; # '
        if (/\]\>/is) {
            $_ = $` . $&;
        }
        $internalSubset = $_;
    }
    close (F);
}

my $shandler = new SerializeHandler($xmlDecl, $internalSubset);
my $handler = new ProfileHandler($shandler, %profile);
my $parser = new XML::Parser::PerlSAX (Handler => $handler);

$parser->parse (Source => { 'SystemId' => $xmlfile });

close (STDOUT);
exit;

sub forbidden {
    # FIXME: make this work on my server
    #print "HTTP/1.1 403 Forbidden\n";
    #print "Connection: close\n";
    print "Content-Type: text/html; charset=iso-8859-1\n\n";

    print "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n";
    print "<HTML><HEAD>\n";
    print "<TITLE>403 Forbidden</TITLE>\n";
    print "</HEAD><BODY>\n";
    print "<H1>Forbidden</H1>\n";
    print "You don't have permission to access that resource\n";
    print "on this server.<P>\n";
    print "</BODY></HTML>\n";
    exit 0;
}

package ProfileHandler;

sub new {
    my $type = shift;
    my $chain = shift;
    my %profile = @_;
    my @stack = ();

    my $self = { 'chain' => $chain,
                 'profile' => \%profile,
                 'stack' => \@stack };

    return bless $self, $type;
}

sub start_document {
    my $self = shift;

    $self->{'chain'}->start_document() if $self->{'chain'};
    $self->include();
}

sub start_element {
    my $self = shift;
    my $element = shift;

    #print $element->{'Name'}, ": ", $self->context(), "\n";

    if ($self->context()) {
        my %profile = %{$self->{'profile'}};
        my %attrs = %{$element->{'Attributes'}};
        my $match = 1;

        foreach my $attr (keys %attrs) {
            if (exists $profile{$attr}) {
                my $value = $attrs{$attr};
                my $prof = $profile{$attr};
                $match = $match && $self->profileMatch($value,$prof);
            }
        }

        if ($match) {
            $self->include();
        } else {
            $self->ignore();
        }
    } else {
        $self->ignore();
    }

    if ($self->context()) {
        $self->{'chain'}->start_element($element) if $self->{'chain'};
    }
}

sub end_element {
    my $self = shift;

    if ($self->context()) {
        $self->{'chain'}->end_element(@_) if $self->{'chain'};
    }
    $self->pop();
}

sub characters {
    my $self = shift;

    if ($self->context()) {
        $self->{'chain'}->characters(@_) if $self->{'chain'};
    }
}

sub processing_instruction {
    my $self = shift;
    $self->{'chain'}->processing_instruction(@_) if $self->{'chain'};
}

sub comment {
    my $self = shift;

    if ($self->context()) {
        $self->{'chain'}->comment(@_) if $self->{'chain'};
    }
}

sub ignore {
    my $self = shift;

    #print "IGNORE\n";
    push (@{$self->{'stack'}}, 0);
}

sub include {
    my $self = shift;

    #print "INCLUDE\n";
    push (@{$self->{'stack'}}, 1);
}

sub pop {
    my $self = shift;

    #print "POP\n";
    pop (@{$self->{'stack'}});
}

sub context {
    my $self = shift;

    my @stack = @{$self->{'stack'}};
    #print "CONTEXT: ", $stack[$#stack], "\n";
    return $stack[$#stack];
}

sub profileMatch {
    my $self = shift;
    my $values= shift;
    my $profiles = shift;
    my %profs = ();
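
The include/ignore stack that ProfileHandler keeps can also be sketched with Python's standard `xml.sax`. This is an illustration, not a port: the class name `ProfileFilter` and helper `profile_string` are hypothetical, and the matching rule (an element is kept if any `|`-separated value of a profiled attribute is among the requested ones) is an assumption, since the body of `profileMatch` is cut off in the archived message.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


class ProfileFilter(xml.sax.ContentHandler):
    """Forward SAX events downstream only while inside an included element."""

    def __init__(self, downstream, profile):
        super().__init__()
        self.downstream = downstream
        self.profile = profile  # e.g. {"condition": {"html"}}
        self.stack = [True]     # document level is always included

    def _matches(self, attrs):
        # Assumed rule: every profiled attribute present on the element must
        # share at least one |-separated value with the requested set.
        for name, wanted in self.profile.items():
            if name in attrs and not (set(attrs[name].split("|")) & wanted):
                return False
        return True

    def startElement(self, name, attrs):
        keep = self.stack[-1] and self._matches(attrs)
        self.stack.append(keep)
        if keep:
            self.downstream.startElement(name, attrs)

    def endElement(self, name):
        if self.stack.pop():
            self.downstream.endElement(name)

    def characters(self, content):
        if self.stack[-1]:
            self.downstream.characters(content)


def profile_string(source, profile):
    """Profile an XML string and return the serialized result."""
    out = io.StringIO()
    generator = XMLGenerator(out, encoding="utf-8")
    xml.sax.parseString(source.encode("utf-8"), ProfileFilter(generator, profile))
    return out.getvalue()
```

As with the XSLT approach discussed in the thread, this plain serializer does not reproduce the DOCTYPE or internal subset; the script above works around that with its own SerializeHandler.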

 

DOCBOOK-APPS: Re: conditionalization of XML

2002-10-11 Thread Norman Walsh

/ Bob Stayton [EMAIL PROTECTED] was heard to say:
| 3. OTOH, I really do want this to happen before validation. That way I can write
| 
|    <chapter>
|      <title condition="print">Print Title</title>
|      <title condition="online">Online Title</title>
| 
|    and have the right thing happen.
|
| On the third hand, you can't load such documents into
| a validating editor.  You'd have to wait until you process
| it with conditions resolved to find out if it is valid.  Or
| you write a smart validating editor that understands your
| conditional syntax, and lets you set the conditions that
| apply for a given session.  Text outside the conditions
| could be dimmed and excluded from validation.  Now *that*
| is the way to write conditional documents.

That would be nice indeed.

| What happened to the old method of using phrase:
| <chapter>
|   <title><phrase condition="print">Print Title</phrase>
|          <phrase condition="online">Online Title</phrase></title>
|
| which can be validated?

Nothing. I still think that's a good idea. And really, the most common
things I've found to conditionalize are paragraphs and phrases so I
don't think the title case happens much more frequently than the
if/then/else case.

I was just thinking about the problem more generally. The requirements
seem to boil down to:

   Accept WF (possibly valid) XML, perform profiling, produce a result
   that contains enough information to be validated, if necessary.

| 4. That means that losing the !DOCTYPE declaration is unfortunate.
|But that could be fixed, I think, with a specialty XML parser.
|
| I'm not clear what this means.

If you use XSLT, you lose the !DOCTYPE and internal subset. That
means that you can't do down-stream validation because the information
that you need to do it has been lost.
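
One way to keep that information, in the spirit of the "egregious hack" posted earlier in the thread, is to lift the DOCTYPE declaration out of the raw text and put it back in front of the serialized result. The sketch below uses Python's standard library; the function name `roundtrip_with_doctype` is hypothetical, and the regular expression assumes the internal subset contains no `]` followed by `>` other than its terminator.

```python
import re
import xml.etree.ElementTree as ET

# Matches a DOCTYPE declaration with an optional internal subset.
# Assumption: no "]>" sequence appears inside the subset itself.
DOCTYPE = re.compile(r"<!DOCTYPE\s[^\[>]*(\[.*?\]\s*)?>", re.S)


def roundtrip_with_doctype(source):
    """Parse and re-serialize `source`, re-attaching its DOCTYPE."""
    match = DOCTYPE.search(source)
    prolog = match.group(0) + "\n" if match else ""
    root = ET.fromstring(source)  # a profiling step would go here
    return prolog + ET.tostring(root, encoding="unicode")
```

Entity references in the content are still expanded by the parser, but the output again carries the declarations a downstream validator needs.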

| 5. You know, I really want this at the URI level.
| 
|    <!DOCTYPE book PUBLIC "..." "...">
|    <book>
|      ...
|      <xi:include 
|        href="http://localhost/profile/path/to/document.xml?condition='html'"/>
|    </book>
| 
| Now we're getting somewhere!
|
| Would that be part of XPointer or XInclude?

I'm not sure. I'm thinking it's part of the URI.
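
For what "part of the URI" could mean in practice, here is a sketch of turning such a URI into a file path and a profile, roughly mirroring what the CGI script posted in this thread does with REQUEST_URI. The function name `profile_from_uri` is hypothetical, and the example uses unquoted values (`condition=html`) rather than the quoted form shown above.

```python
from urllib.parse import parse_qsl, urlsplit


def profile_from_uri(uri, marker="/profile/"):
    """Split a profiling URI into (document path, {attribute: "v1|v2"})."""
    parts = urlsplit(uri)
    if marker not in parts.path:
        raise ValueError("not a profiling URI: %r" % uri)
    xmlfile = "/" + parts.path.split(marker, 1)[1]
    profile = {}
    # parse_qsl decodes '+' and %XX escapes; repeated keys are joined with
    # '|', the same accumulation the Perl script performs.
    for key, value in parse_qsl(parts.query):
        profile[key] = profile[key] + "|" + value if key in profile else value
    return xmlfile, profile
```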

Be seeing you,
  norm

-- 
Norman Walsh [EMAIL PROTECTED]  | When the situation is desperate,
http://www.oasis-open.org/docbook/ | it is too late to be serious. Be
Chair, DocBook Technical Committee | playful.--Edward Abbey



DOCBOOK-APPS: Re: conditionalization of XML

2002-10-10 Thread Daniel Veillard

On Thu, Oct 10, 2002 at 06:14:59PM -0400, Eric S. Raymond wrote:
 What would you consider a complete solution to this problem?  I'm not
 wedded to xmlif itself, I just need to get some work done that
 requires being able to conditionalize stuff.  If you think there's a
 better way to handle this, I'm open to it.

  XSLT. Your basic problem is to format to print and web versions. I don't
see the core reason why this can't be possible with XSLT. You have troubles,
but they are mostly related to the fact that you want to operate as a
preprocessor instead of simply relying on stylesheet customization and/or
markup extensions, but I may be missing the point.
  Of course if you're using PIs to do this markup, it's in essence unstructured
and hence makes it difficult to handle easily at the XSLT level. Well,
simply use markup extensions to do this.

Can't you just sed the output in case of errors ?
  s/^processed/original/
  of the stderr stream doesn't sound hard to set up ...
 
 Fine if I'm processing errors in batch mode, yes.  But I mentioned Emacs
 for a reason...

  I don't use Emacs, but I don't see why you couldn't make it execute 
a wrapper shell around the XSLT processor instead of running the processor
directly.
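
The rewriting step such a wrapper would apply to the processor's stderr can be sketched as a small Python function. The name `rewrite_errors` and the sample file names are made up for illustration; the function only maps the preprocessed file name at the start of each message back to the original, as the sed expression quoted above does.

```python
import re


def rewrite_errors(stderr_text, processed, original):
    """Replace `processed` with `original` at the start of each line."""
    pattern = re.compile("^" + re.escape(processed), re.M)
    # A function replacement avoids backslash interpretation in `original`.
    return pattern.sub(lambda m: original, stderr_text)
```

Line numbers in the messages are only meaningful to the editor if the preprocessing step leaves the line count unchanged, which this sketch does not check.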

Daniel
