Hi fellows,

I’m struggling with a TEI/XML processing task in LMTX.
The goal is straightforward: load TEI XML into a buffer and intercept |<app>/<lem>/<rdg>| so that:

1.) |<lem>| appears inline in the verse, and

2.) |<rdg>| variants are printed separately (e.g. as a note or bracketed list).

This already works using indirect approaches (tabular extraction, pre-processed XML, separate Lua routing, etc.), but I am trying to do it directly via |\xmlprocessbuffer| + |\xmlsetsetup| on the native TEI source, and I am consistently failing.


_Observed issue_: The compilation breaks systematically with an error like: |! You can't use 'the letter U+0041 A' in horizontal mode|

which suggests that raw text is leaking into TeX at the wrong moment. Here is the M(N)WE:


% mwe-tei-app-lem-rdg-canonical.tex
% ---Beginning of the MWE code


\setuppapersize[A5][A5]
\setupbodyfont[libertinus,11pt]

\startbuffer[tei]
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="poem">
        <head>Test 1 — interception of &lt;app&gt;</head>
        <lg type="stanza" xml:id="st1">
          <l n="1">
            Arma virumque
            <app>
              <lem wit="#A">cano</lem>
              <rdg wit="#B">canoe</rdg>
              <rdg wit="#C">cano</rdg>
            </app>
          </l>
          <l n="2">Troiae qui primus ab oris</l>
        </lg>
      </div>
    </body>
  </text>
</TEI>
\stopbuffer

\startxmlsetups xml:tei:base
  \xmlsetsetup{tei}
    {*:TEI|*:text|*:body|*:div|*:head|*:lg|*:l|*:app|*:lem|*:rdg}
    {xml:tei:*}
\stopxmlsetups

\xmlregisterdocumentsetup{tei}{xml:tei:base}

\startxmlsetups xml:tei:TEI
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:tei:l
  \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:tei:app
  \xmlfirst{#1}{/lem}
  \space[
    \xmlall{#1}{/rdg}{
      \xmlflush{##1}
      {\tfxx\space(\xmlatt{##1}{wit})}\space
    }
  ]
\stopxmlsetups

\startxmlsetups xml:tei:lem
  {\bf\xmlflush{#1}}
\stopxmlsetups

\startxmlsetups xml:tei:rdg
  {\it\xmlflush{#1}}{\tfxx\space(\xmlatt{#1}{wit})}
\stopxmlsetups

\starttext
  \xmlprocessbuffer{tei}{tei}{}
\stoptext
% ---end of MWE--

*Expected output*: Arma virumque cano [canoe (#B) cano (#C)]
Troiae qui primus ab oris

*Actual result*: "! You can't use 'the letter U+0041 A' in horizontal mode"


*Suspected root cause:* Some interaction between:

1. TEI namespace matching (localname pattern |*:app|, |*:lem|, |*:rdg|),

2. setup registration via |\xmlregisterdocumentsetup|, and

3. the pipeline |\xmlprocessbuffer → xml:tei:*|

is leaking literal text into TeX where only commands/tokens are expected.

*My question is:* What is the recommended minimal pattern in LMTX/MkIV to:

1. load TEI XML from a buffer,

2. properly match namespaced elements such as |<app>/<lem>/<rdg>|, and

3. process them conditionally during output,

*without* external preprocessing or Lua scripting, and *without* converting the structure into a table?

Any authoritative example or pointer would be extremely helpful — even if the conclusion is that a different interception strategy is necessary.

Many thanks in advance,
JP



