Hello everyone,

I'm answering my own question, but perhaps my answer will enlighten those who are searching in the darkness of XML nodes.

I searched and found the reasons why it is difficult to produce a decent TEI-XML MWE.

Following up on our earlier discussion, here is a short explanation of why the original MWE I posted did not work as expected.

The core issue was the line:

|\xmlsetsetup{tei} {*:TEI|*:text|*:body|*:div|*:head|*:lg|*:l|*:app|*:lem|*:rdg} {xml:tei:*} |

This mapping instructs ConTeXt to process all matching elements using setups named |xml:tei:<local-name>|. However, I only defined setups for:

|xml:tei:TEI xml:tei:l xml:tei:app xml:tei:lem xml:tei:rdg |

No setups existed for:

|xml:tei:text xml:tei:body xml:tei:div xml:tei:head xml:tei:lg |

As a result, these elements were effectively “mute”: their contents were not traversed, and the subtree never reached a complete processing path.
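
For illustration, here is a minimal sketch of pass-through setups for the five unhandled elements. It assumes these containers only need to hand their content on, with styling left out, and it has not been tested against the MWE:

% --- sketch: pass-through setups (assumption: no styling wanted yet) ---
\startxmlsetups xml:tei:text
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:tei:body
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:tei:div
  \xmlflush{#1}
\stopxmlsetups

% assumption: the heading should end its own paragraph
\startxmlsetups xml:tei:head
  \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:tei:lg
  \xmlflush{#1}
\stopxmlsetups
% --- end of sketch ---

Each setup only calls |\xmlflush{#1}|, so the children of the element are processed by whatever setups are mapped to them.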

A second issue came from using absolute XPath-style selectors:

|\xmlfirst{#1}{/lem} \xmlall{#1}{/rdg}{...} |

Because |/lem| and |/rdg| are interpreted as absolute paths from the document root, they did not match anything under |<app>|. Changing them to *relative paths* solves this:

|\xmlfirst{#1}{lem} \xmlall{#1}{rdg}{...} |
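
Applied to the |<app>| setup of the MWE, the relative form might look like the sketch below. It assumes that the existing |xml:tei:lem| and |xml:tei:rdg| setups do the formatting, and it uses the two-argument form of |\xmlall|. That is a simplification for illustration, not necessarily the final code, and it is untested:

% --- sketch: <app> with relative paths (assumption: two-argument \xmlall) ---
% the lemma goes inline and is typeset by xml:tei:lem;
% the readings follow in brackets, each typeset by xml:tei:rdg
\startxmlsetups xml:tei:app
  \xmlfirst{#1}{lem}
  \space[\xmlall{#1}{rdg}]
\stopxmlsetups
% --- end of sketch ---

The witness siglum is left to |xml:tei:rdg|, which already prints |\xmlatt{#1}{wit}| in the MWE.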

Finally, adding a generic fallback for all mapped elements, e.g.:

|\startxmlsetups xml:tei:* \xmlflush{#1} \stopxmlsetups|

ensures that all targeted nodes are traversed even if they are not individually specialized.
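
A related option, offered only as an untested sketch: map just the elements that actually have a dedicated setup, and leave the remaining containers to ConTeXt's default handling. This assumes that unmapped elements have their content flushed by default, which should be checked against the MkIV XML manual before relying on it:

% --- sketch: narrower mapping (assumption: unmapped elements are flushed by default) ---
\startxmlsetups xml:tei:base
  \xmlsetsetup{tei}
    {*:TEI|*:l|*:app|*:lem|*:rdg}
    {xml:tei:*}
\stopxmlsetups

\xmlregisterdocumentsetup{tei}{xml:tei:base}
% --- end of sketch ---

With this mapping, every pattern on the list corresponds to a setup that exists in the MWE, so no element is routed to an undefined setup.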

With these three changes:

1. a fallback for |xml:tei:*|,
2. relative paths for |<lem>| and |<rdg>|,
3. and optional dedicated setups for |<div>|, |<head>|, |<lg>|, etc.,

the MWE becomes functional and robust, and additional styling can be layered on top.

And now I get a PDF output that is almost completely satisfactory. Sorry for the noise!

Best regards,
JP


On 18/12/2025 at 01:02, Jean-Pierre Delange via ntg-context wrote:

Hi fellows,

I’m struggling with a TEI/XML processing task in LMTX.
The goal is straightforward: load TEI XML into a buffer and intercept |<app>/<lem>/<rdg>| so that:

1.) |<lem>| appears inline in the verse, and

2.) |<rdg>| variants are printed separately (e.g. as a note or bracketed list).

This already works using indirect approaches (tabular extraction, pre-processed XML, separate Lua routing, etc.), but I am trying to do it directly via |\xmlprocessbuffer| + |\xmlsetsetup| on the native TEI source, and I am consistently failing.


_Observed issue_: The compilation breaks systematically with an error like: |! You can't use 'the letter U+0041 A' in horizontal mode|

which suggests that raw text is leaking into TeX at the wrong moment. Here below is the M(N)WE:


% mwe-tei-app-lem-rdg-canonical.tex
% ---Beginning of the MWE code


\setuppapersize[A5][A5]
\setupbodyfont[libertinus,11pt]

\startbuffer[tei]
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="poem">
        <head>Test 1 — interception of &lt;app&gt;</head>
        <lg type="stanza" xml:id="st1">
          <l n="1">
            Arma virumque
            <app>
              <lem wit="#A">cano</lem>
              <rdg wit="#B">canoe</rdg>
              <rdg wit="#C">cano</rdg>
            </app>
          </l>
          <l n="2">Troiae qui primus ab oris</l>
        </lg>
      </div>
    </body>
  </text>
</TEI>
\stopbuffer

\startxmlsetups xml:tei:base
  \xmlsetsetup{tei}
    {*:TEI|*:text|*:body|*:div|*:head|*:lg|*:l|*:app|*:lem|*:rdg}
    {xml:tei:*}
\stopxmlsetups

\xmlregisterdocumentsetup{tei}{xml:tei:base}

\startxmlsetups xml:tei:TEI
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:tei:l
  \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:tei:app
  \xmlfirst{#1}{/lem}
  \space[
    \xmlall{#1}{/rdg}{
      \xmlflush{##1}
      {\tfxx\space(\xmlatt{##1}{wit})}\space
    }
  ]
\stopxmlsetups

\startxmlsetups xml:tei:lem
  {\bf\xmlflush{#1}}
\stopxmlsetups

\startxmlsetups xml:tei:rdg
  {\it\xmlflush{#1}}{\tfxx\space(\xmlatt{#1}{wit})}
\stopxmlsetups

\starttext
  \xmlprocessbuffer{tei}{tei}{}
\stoptext
% ---end of MWE--

*Expected output*: Arma virumque cano [canoe (#B) cano (#C)]
Troiae qui primus ab oris

*Actual result*: "! You can't use 'the letter U+0041 A' in horizontal mode"


*Suspected root cause:* Some interaction between:

1. TEI namespace matching (localname pattern |*:app|, |*:lem|, |*:rdg|),
2. setup registration via |\xmlregisterdocumentsetup|,
3. pipeline |\xmlprocessbuffer → xml:tei:*|,

is leaking literal text into TeX where only commands/tokens are expected.

*My question is:* What is the recommended minimal pattern in LMTX/MkIV to:

1. load TEI XML from a buffer,
2. properly match namespaced elements such as |<app>/<lem>/<rdg>|, and
3. process them conditionally during output,

*without* external preprocessing or Lua scripting, and *without* converting the structure into a table?

Any authoritative example or pointer would be extremely helpful — even if the conclusion is that a different interception strategy is necessary.

Many thanks in advance,
JP






___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist :[email protected] 
/https://mailman.ntg.nl/mailman3/lists/ntg-context.ntg.nl
webpage  :https://www.pragma-ade.nl /https://context.aanhet.net (mirror)
archive  :https://github.com/contextgarden/context
wiki     :https://wiki.contextgarden.net
___________________________________________________________________________________