Hello everyone,
I'm answering my own question, but perhaps my answer will enlighten
those who are searching in the darkness of XML nodes.
I dug into why it is so difficult to produce a decent TEI-XML MWE.
Following up on our earlier discussion, here is a short explanation of
why the original MWE I posted did not work as expected.
The core issue was the line:
    \xmlsetsetup{tei}
      {*:TEI|*:text|*:body|*:div|*:head|*:lg|*:l|*:app|*:lem|*:rdg}
      {xml:tei:*}
This mapping instructs ConTeXt to process all matching elements using
setups named |xml:tei:<local-name>|. However, I only defined setups for:
|xml:tei:TEI|, |xml:tei:l|, |xml:tei:app|, |xml:tei:lem|, |xml:tei:rdg|
No setups existed for:
|xml:tei:text|, |xml:tei:body|, |xml:tei:div|, |xml:tei:head|, |xml:tei:lg|
As a result, these elements were effectively “mute”: with no setup to
flush them, their contents were not traversed, so nothing nested inside
them (including every |<l>| and |<app>|) was ever processed.
A second issue came from using absolute XPath-style selectors:
|\xmlfirst{#1}{/lem} \xmlall{#1}{/rdg}{...}|
As far as I can tell, |/lem| and |/rdg| were being resolved as absolute
paths from the document root, so they did not match anything under
|<app>|. Changing them to *relative paths* solves this:
|\xmlfirst{#1}{lem} \xmlall{#1}{rdg}{...}|
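To make the relative selectors concrete, here is a minimal sketch of how the three variant-related setups could look once the leading slashes are gone. It assumes that |\xmlall| takes only an id and a pattern and flushes each match through its own setup, so the per-reading formatting lives in |xml:tei:rdg| rather than in a third argument. |\removeunwantedspaces| is only there to drop the trailing space before the closing bracket.

```tex
\startxmlsetups xml:tei:app
    % the lemma stays inline; "lem" is looked up below the current <app>
    \xmlfirst{#1}{lem}
    % all readings follow in brackets, each one formatted by xml:tei:rdg
    \space[\xmlall{#1}{rdg}\removeunwantedspaces]
\stopxmlsetups

\startxmlsetups xml:tei:lem
    {\bf\xmlflush{#1}}
\stopxmlsetups

\startxmlsetups xml:tei:rdg
    % reading in italics, followed by its witness siglum in small type
    {\it\xmlflush{#1}}\space{\tfxx(\xmlatt{#1}{wit})}\space
\stopxmlsetups
```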
Finally, adding a generic fallback for all mapped elements, e.g.:
|\startxmlsetups xml:tei:* \xmlflush{#1} \stopxmlsetups|
is meant to ensure that all targeted nodes are traversed even if they
are not individually specialized.
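A variation on this fallback that does not depend on how the |*| in the setup name is resolved is to map the purely structural elements to one shared setup explicitly. The sketch below assumes the same document id |tei| as the MWE. The name |xml:tei:passthrough| is made up for illustration and is not part of ConTeXt or of the original code.

```tex
% hypothetical shared setup: do nothing but hand the children on
\startxmlsetups xml:tei:passthrough
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:tei:base
    % containers without a dedicated setup all share the passthrough
    \xmlsetsetup{#1}{*:text|*:body|*:div|*:head|*:lg}{xml:tei:passthrough}
    % elements with their own setups: * is replaced by the element name
    \xmlsetsetup{#1}{*:TEI|*:l|*:app|*:lem|*:rdg}{xml:tei:*}
\stopxmlsetups

% #1 above is the document id supplied through this registration
\xmlregisterdocumentsetup{tei}{xml:tei:base}
```

Elements mapped this way simply pass their content through, so any |<l>| and |<app>| nested inside them are still reached by their own setups.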
With these three changes:
1. a fallback for |xml:tei:*|,
2. relative paths for |<lem>| and |<rdg>|,
3. and optional dedicated setups for |<div>|, |<head>|, |<lg>|, etc.,
the MWE becomes functional and robust, and additional styling can be
layered on top.
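As an illustration of the third point, dedicated setups for the structural elements might look like the sketch below. The spacing and font choices are arbitrary placeholders, not taken from the original file, and the setups are only picked up if |div|, |head| and |lg| are among the elements mapped to |xml:tei:*| by |\xmlsetsetup|.

```tex
\startxmlsetups xml:tei:div
    % a poem or section: pass the content on, with space around it
    \blank[big]
    \xmlflush{#1}
    \blank[big]
\stopxmlsetups

\startxmlsetups xml:tei:head
    % title line in bold, as a paragraph of its own
    {\bf\xmlflush{#1}}\par
    \blank[medium]
\stopxmlsetups

\startxmlsetups xml:tei:lg
    % stanza: the <l> setup ends each verse with \par
    \xmlflush{#1}
    \blank[medium]
\stopxmlsetups
```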
And now I get a PDF output that is almost completely satisfactory. Sorry
for the noise!
Best regards,
JP
On 18/12/2025 at 01:02, Jean-Pierre Delange via ntg-context wrote:
Hi fellows,
I’m struggling with a TEI/XML processing task in LMTX.
The goal is straightforward: load TEI XML into a buffer and intercept
|<app>|/|<lem>|/|<rdg>| so that:
1. |<lem>| appears inline in the verse, and
2. |<rdg>| variants are printed separately (e.g. as a note or
   bracketed list).
This already works using indirect approaches (tabular extraction,
pre-processed XML, separate Lua routing, etc.), but I am trying to do
it directly via |\xmlprocessbuffer| + |\xmlsetsetup| on the native TEI
source, and I am consistently failing.
_Observed issue_: The compilation breaks systematically with an
error like: |! You can't use 'the letter U+0041 A' in horizontal mode|
which suggests that raw text is leaking into TeX at the wrong moment.
Here is the M(N)WE:
% mwe-tei-app-lem-rdg-canonical.tex
% ---Beginning of the MWE code
\setuppapersize[A5][A5]
\setupbodyfont[libertinus,11pt]
\startbuffer[tei]
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<text>
<body>
<div type="poem">
<head>Test 1 — interception of &lt;app&gt;</head>
<lg type="stanza" xml:id="st1">
<l n="1">
Arma virumque
<app>
<lem wit="#A">cano</lem>
<rdg wit="#B">canoe</rdg>
<rdg wit="#C">cano</rdg>
</app>
</l>
<l n="2">Troiae qui primus ab oris</l>
</lg>
</div>
</body>
</text>
</TEI>
\stopbuffer
\startxmlsetups xml:tei:base
\xmlsetsetup{tei}
{*:TEI|*:text|*:body|*:div|*:head|*:lg|*:l|*:app|*:lem|*:rdg}
{xml:tei:*}
\stopxmlsetups
\xmlregisterdocumentsetup{tei}{xml:tei:base}
\startxmlsetups xml:tei:TEI
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:tei:l
\xmlflush{#1}\par
\stopxmlsetups
\startxmlsetups xml:tei:app
\xmlfirst{#1}{/lem}
\space[
\xmlall{#1}{/rdg}{
\xmlflush{##1}
{\tfxx\space(\xmlatt{##1}{wit})}\space
}
]
\stopxmlsetups
\startxmlsetups xml:tei:lem
{\bf\xmlflush{#1}}
\stopxmlsetups
\startxmlsetups xml:tei:rdg
{\it\xmlflush{#1}}{\tfxx\space(\xmlatt{#1}{wit})}
\stopxmlsetups
\starttext
\xmlprocessbuffer{tei}{tei}{}
\stoptext
% ---end of MWE--
*Expected output:*
Arma virumque cano [canoe (#B) cano (#C)]
Troiae qui primus ab oris
*Actual result:* |! You can't use 'the letter U+0041 A' in horizontal mode|
*Suspected root cause:* some interaction between
1. TEI namespace matching (local-name patterns |*:app|, |*:lem|, |*:rdg|),
2. setup registration via |\xmlregisterdocumentsetup|, and
3. the |\xmlprocessbuffer| → |xml:tei:*| pipeline
is leaking literal text into TeX where only commands/tokens are expected.
*My question is:* what is the recommended minimal pattern in
LMTX/MkIV to
1. load TEI XML from a buffer,
2. properly match namespaced elements such as |<app>|/|<lem>|/|<rdg>|, and
3. process them conditionally during output,
*without* external preprocessing or Lua scripting, and *without*
converting the structure into a table?
Any authoritative example or pointer would be extremely helpful — even
if the conclusion is that a different interception strategy is necessary.
Many thanks in advance,
JP
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the
Wiki!
maillist : [email protected] /
https://mailman.ntg.nl/mailman3/lists/ntg-context.ntg.nl
webpage : https://www.pragma-ade.nl / https://context.aanhet.net (mirror)
archive :https://github.com/contextgarden/context
wiki :https://wiki.contextgarden.net
___________________________________________________________________________________