Hi fellows,
I’m struggling with a TEI/XML processing task in LMTX.
The goal is straightforward: load TEI XML into a buffer and intercept
|<app>/<lem>/<rdg>| so that:
1.) |<lem>| appears inline in the verse, and
2.) |<rdg>| variants are printed separately (e.g. as a note or bracketed
list).
This already works using indirect approaches (tabular extraction,
pre-processed XML, separate Lua routing, etc.), but I am trying to do it
directly via |\xmlprocessbuffer| + |\xmlsetsetup| on the native TEI
source, and I am consistently failing.
_Observed issue_: Compilation consistently stops with an error like
|! You can't use 'the letter U+0041 A' in horizontal mode|,
which suggests that raw text is leaking into TeX at the wrong moment.
Here is the M(N)WE:
% mwe-tei-app-lem-rdg-canonical.tex
% ---Beginning of the MWE code
\setuppapersize[A5][A5]
\setupbodyfont[libertinus,11pt]
\startbuffer[tei]
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<text>
<body>
<div type="poem">
<head>Test 1 — interception of &lt;app&gt;</head>
<lg type="stanza" xml:id="st1">
<l n="1">
Arma virumque
<app>
<lem wit="#A">cano</lem>
<rdg wit="#B">canoe</rdg>
<rdg wit="#C">cano</rdg>
</app>
</l>
<l n="2">Troiae qui primus ab oris</l>
</lg>
</div>
</body>
</text>
</TEI>
\stopbuffer
\startxmlsetups xml:tei:base
\xmlsetsetup{tei}
{*:TEI|*:text|*:body|*:div|*:head|*:lg|*:l|*:app|*:lem|*:rdg}
{xml:tei:*}
\stopxmlsetups
\xmlregisterdocumentsetup{tei}{xml:tei:base}
\startxmlsetups xml:tei:TEI
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:tei:l
\xmlflush{#1}\par
\stopxmlsetups
\startxmlsetups xml:tei:app
\xmlfirst{#1}{/lem}
\space[
\xmlall{#1}{/rdg}{
\xmlflush{##1}
{\tfxx\space(\xmlatt{##1}{wit})}\space
}
]
\stopxmlsetups
\startxmlsetups xml:tei:lem
{\bf\xmlflush{#1}}
\stopxmlsetups
\startxmlsetups xml:tei:rdg
{\it\xmlflush{#1}}{\tfxx\space(\xmlatt{#1}{wit})}
\stopxmlsetups
\starttext
\xmlprocessbuffer{tei}{tei}{}
\stoptext
% ---end of MWE--
*Expected output*: Arma virumque cano [canoe (#B) cano (#C)]
Troiae qui primus ab oris
*Actual result*: |! You can't use 'the letter U+0041 A' in horizontal mode|
*Suspected root cause:* Some interaction between the following is leaking
literal text into TeX where only commands/tokens are expected:
1. TEI namespace matching (localname pattern |*:app|, |*:lem|, |*:rdg|),
2. setup registration via |\xmlregisterdocumentsetup|,
3. the pipeline |\xmlprocessbuffer → xml:tei:*|.
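To narrow this down, tracing which setups are actually assigned might help. A minimal sketch, to be placed before |\starttext| in the MWE; the tracker name |lxml.setups| is quoted from memory and should be treated as an assumption:

```tex
% Assumption: "lxml.setups" is the tracker that logs which setup
% is assigned to which element while the document is processed.
\enabletrackers[lxml.setups]
```

If the log shows no assignment for |app|, |lem| or |rdg|, the problem would be in the pattern matching rather than in the setups themselves.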
*My question is:* What is the recommended minimal pattern in LMTX/MkIV to:
1. load TEI XML from a buffer,
2. properly match namespaced elements such as |<app>/<lem>/<rdg>|, and
3. process them conditionally during output,
*without* external preprocessing or Lua scripting, and *without*
converting the structure into a table?
Any authoritative example or pointer would be extremely helpful, even
if the conclusion is that a different interception strategy is necessary.
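For reference, this is roughly the shape I would expect a direct solution to take. It is an untested sketch, not a confirmed answer. It assumes that the buffer |tei| holds well-formed XML, that bare element names in patterns match regardless of the default TEI namespace (otherwise the |*:l| form would be needed), that elements without an assigned setup are simply flushed, and that |\xmlall| takes only an id and a pattern, so that the per-variant formatting lives in the |rdg| setup. The document id |main| is just an illustrative name.

```tex
% Untested sketch under the assumptions stated above.
\startxmlsetups xml:tei:base
    % Map only the elements that have a setup defined below.
    \xmlsetsetup{#1}{l|app|lem|rdg}{xml:tei:*}
\stopxmlsetups

\xmlregistersetup{xml:tei:base}

\startxmlsetups xml:tei:l
    \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:tei:app
    % The lemma goes inline, the variants follow in brackets.
    \xmlfirst{#1}{/lem}
    \space[\xmlall{#1}{/rdg}]
\stopxmlsetups

\startxmlsetups xml:tei:lem
    {\bf\xmlflush{#1}}
\stopxmlsetups

\startxmlsetups xml:tei:rdg
    % Each variant prints its text, then its witness in small type.
    {\it\xmlflush{#1}}{\tfxx\space(\xmlatt{#1}{wit})}\space
\stopxmlsetups

\starttext
    \xmlprocessbuffer{main}{tei}{}
\stoptext
```

The main difference from the MWE is that nothing is passed to |\xmlall| beyond the id and the pattern, so no |##1| appears inside a setup body.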
Many thanks in advance,
JP