Dear List,
I am replying to my own message because I have found a solution. If
there is a better one, I would be happy to hear about it.
I finally understood what was wrong with my earlier Lua approach to
building a double index, which relied on 'input.processors'.
The working approach drops 'input.processors' altogether. I define the
two registers that correspond to the two indexes (nomen for the index
nominum, rerum for the index rerum), define an 'autoindex' function in
Lua, and put the text, with its chapters and sections, in a buffer. The
Lua code scans the buffer for the terms to be indexed, registers each
hit, and then typesets the text unchanged.
Naturally, the two list files 'nomen-list.txt' and 'rerum-list.txt'
(plain UTF-8 text) must contain the terms that should appear in the
indexes (I mention this for complete beginners). The MWE follows.
Best//JP
\setuplayout[width=middle, backspace=20mm, topspace=15mm, height=middle]
\setupinteraction[state=start]
\mainlanguage[en]
% === Registers ===
\defineregister[nomen]
\defineregister[rerum]
\setupregister[nomen][indicator=yes,balance=yes,compress=yes,state=stop]
\setupregister[rerum][indicator=yes,balance=yes,compress=yes,state=stop]
% === Auto-index (Lua): buffer processing, WITHOUT input.processors ====
\startluacode
-- Load a list of "key|display" lines (a bare term is both key and display)
local function loadlist(name)
  local t = {}
  local found = (resolvers and resolvers.findfile and
    resolvers.findfile(name)) or name
  local f = io.open(found, "r"); if not f then return t end
  for line in f:lines() do
    local s = line:gsub("^%s+", ""):gsub("%s+$", "")
    if s ~= "" and s:sub(1,1) ~= "#" then
      local k, d = s:match("^(.-)|(.+)$"); if not k then k, d = s, s end
      t[#t+1] = { key = k, disp = d }
    end
  end
  f:close()
  -- longest display strings first, so longer terms are tried before shorter ones
  table.sort(t, function(a, b) return #a.disp > #b.disp end)
  return t
end
local NOMEN = loadlist("nomen-list.txt")
local RERUM = loadlist("rerum-list.txt")
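local function escpat(s)
  -- escape the characters that are magic in Lua patterns
  local p = s:gsub("([%%%^%$%*%(%)%+%-%[%]%?%.])", "%%%1")
  -- Lua character classes are byte-based, so "' or ’" is written as the
  -- byte 0x27 or 0xE2, optionally followed by 0x80 0x99 (UTF-8 for ’)
  local apos = "['" .. string.char(226) .. "]" .. string.char(128) .. "?" .. string.char(153) .. "?"
  p = p:gsub("’", "'"):gsub("'", apos) -- ' or ’
  -- likewise "-, – or —": 0x2D, or 0xE2 0x80 followed by 0x93 or 0x94
  local dash = "[-" .. string.char(226) .. "]" .. string.char(128) .. "?[" .. string.char(147, 148) .. "]?"
  p = p:gsub("%%%-", dash) -- -, – or —
  return p
end
-- The gsub-based matching in add_entries copes with the many French
-- diacritical marks (apostrophes, accented vowels, etc.)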
local function add_entries(str, list, reg)
  for i = 1, #list do
    local key = list[i].key
    local disp = list[i].disp
    if disp and disp ~= "" then
      local pat = escpat(disp)
      -- gsub with a function: we REGISTER the entry, we return the text unchanged
      str = str:gsub("(" .. pat .. ")", function(hit)
        context.setregisterentry({ reg }, { keys = key, entries = disp })
        return hit
      end)
    end
  end
  return str
end
userdata = userdata or {}
function userdata.typeset_with_autoindex(bufname)
  local s = buffers.getcontent(bufname) or ""
  s = add_entries(s, NOMEN, "nomen")
  s = add_entries(s, RERUM, "rerum")
  context(s) -- typeset the processed text, AT THE RIGHT TIME
end
\stopluacode
\starttext
\setupregister[nomen][state=start]
\setupregister[rerum][state=start]
% --- Example content (in a buffer) ---------------------------------------
\startbuffer[autotext]
… Après avoir été le maître d’Augustin d’Hippone, vraisemblablement
avant son séjour à Cassiciacum (Ambroise de Milan). Augustin était passé
par la philosophie (il avoue dans les {\em Confessions}) avoir lu les
\quote{libri platonicorum}.
La réponse contemporaine est donnée par Émile Bréhier, protagoniste du
débat avec Étienne Gilson.
La {\em théologie} des premiers siècles a trouvé chez les stoïciens le
vocabulaire pour penser la grâce.
\stopbuffer
% Processing + composition of the buffer
\ctxlua{userdata.typeset_with_autoindex("autotext")}
\page
\startbackmatter
\startchapter[title={\sc Index}]
\startsection[title={{\em Index nominum}}]
\placeregister[nomen][columns=2,balance=yes,criterium=all,title={{Index
nominum}}]
\stopsection
\page
\startsection[title={{\em Index rerum}}]
\placeregister[rerum][columns=2,balance=yes,criterium=all,title={{Index
rerum}}]
\stopsection
\stopchapter
\stopbackmatter
\stoptext
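For complete beginners, here is a sketch of what a list file might look
like, judging only from what loadlist in the MWE above accepts: one term
per line; an optional key|display form, where the part before the bar is
the sort key and the part after it is the string searched for in the
text; lines starting with # are skipped. The entries are illustrative
and taken from the sample buffer, not from a real word list.

```text
# nomen-list.txt (UTF-8); lines starting with # are ignored
Ambroise de Milan
Augustin
Bréhier|Émile Bréhier
Gilson|Étienne Gilson
```

A rerum-list.txt would follow the same format, for instance with the
lines "théologie" and "stoïciens".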
On 30/09/2025 at 21:33, Jean-Pierre Delange via ntg-context wrote:
Sorry! The message was sent too early! Here is the MWE:
\setuplayout[width=middle, backspace=20mm, topspace=15mm, height=middle]
\setupinteraction[state=start]
\setuplanguage[fr][patterns={fr,agr}]
\mainlanguage[fr]
\setcharacterspacing[frenchpunctuation]
% --- Registers (definition + style + stopped in the preamble) ---
\defineregister[nomen]
\defineregister[rerum]
\setupregister[nomen][indicator=yes, balance=yes, compress=yes,
state=stop]
\setupregister[rerum][indicator=yes, balance=yes, compress=yes,
state=stop]
% === Auto-index (Lua): detection & \setregisterentry ===
\startluacode
local collecting = false
local in_lua = false
-- Load a "key|display" list (or a bare "term", where key = display)
local function loadlist(name)
local t = {}
local found = (resolvers and resolvers.findfile and
resolvers.findfile(name)) or name
local f = io.open(found, "r")
if not f then return t end
for line in f:lines() do
local s = line:gsub("^%s+", ""):gsub("%s+$", "")
if s ~= "" and s:sub(1,1) ~= "#" then
local k, d = s:match("^(.-)|(.+)$"); if not k then k, d = s, s end
t[#t+1] = { key = k, disp = d }
end
end
f:close()
-- match the longest terms first to avoid overlaps
table.sort(t, function(a,b) return #a.disp > #b.disp end)
return t
end
local NOMEN = loadlist("nomen-list.txt")
local RERUM = loadlist("rerum-list.txt")
-- Escape a Lua pattern + tolerate a few variants
local function esc_pattern(s)
local p = s:gsub("([%%%^%$%*%(%)%+%-%[%]%?%.])","%%%1")
p = p:gsub("'", "['’']") -- apostrophes
p = p:gsub("’", "['’']")
p = p:gsub("%%%-", "[-–—]") -- short/long dashes
return p
end
-- Inject a \setregisterentry entry at each occurrence found
local function add_entries(buf, list, reg)
for i=1,#list do
local key = list[i].key
local disp = list[i].disp
if disp and disp ~= "" then
local pat = esc_pattern(disp)
local cmd =
"\\setregisterentry["..reg.."][keys:1={"..key.."}][entries:1={"..disp.."}]"
-- prints nothing: \setregisterentry produces no text
buf = buf:gsub("("..pat..")", "%1"..cmd)
-- Debug (optional):
-- local n; buf, n = buf:gsub("("..pat..")", "%1"..cmd); if n>0 then
-- logs.report("autoindex","%s: %s -> %d", reg, disp, n) end
end
end
return buf
end
-- Filter (input processor): only works between
-- \starttext...\stoptext and outside \start/stopluacode
local function filter(buf)
if buf:find("\\startluacode") then in_lua = true end
if buf:find("\\stopluacode") then in_lua = false end
if buf:find("\\starttext") then collecting = true end
if buf:find("\\stoptext") then collecting = false end
if not collecting or in_lua then return buf end
buf = add_entries(buf, NOMEN, "nomen")
buf = add_entries(buf, RERUM, "rerum")
return buf
end
-- Register the processor (LMTX)
if input and input.processors and input.processors.add then
input.processors.add("autoindex", filter)
end
\stopluacode
% Enable the processor (otherwise it will not run)
\enabledirectives[input.processors=autoindex]
\starttext
% Start collecting entries from here on
\setupregister[nomen][state=start]
\setupregister[rerum][state=start]
\subject{Paragraph 1 (names : Bréhier, Cassiciacum, Augustin)}
Qui plus est, on peut ajouter qu'il (Cicéron) a été, comme Virgile,
l'un des {\em éducateurs} de l'Europe,
après avoir été le maître d'Augustin d'Hippone — lequel l'a lu très
jeune, vraisemblablement
avant son séjour à Cassiciacum, entouré de sa famille et de ses amis.
En second lieu, la réponse
contemporaine est donnée par Émile Bréhier, historien de la
philosophie, protagoniste du débat
qu'il engagea avec Étienne Gilson au début du XX\high{e} siècle.
\blank[2*big]
\subject{Paragraph 2 (notions : hérétiques, stoïciens, théologie ; nom
: Thomas d'Aquin)}
… la {\em théologie} chrétienne des premiers siècles a trouvé chez les
penseurs aristotéliciens et
stoïciens (stoïciens) le vocabulaire pour penser la grâce, la nature,
la vertu. Cela pourrait s'entendre
si, par exemple, Thomas d'Aquin avait été le précurseur de «
l'aristotélisme chrétien »…
\page
\startbackmatter
\startchapter[title={\sc Index}]
\startsection[title={{\em Index nominum}}]
\placeregister[nomen][columns=2,balance=yes,criterium=all,title={{Index
nominum}}]
\stopsection
\page
\startsection[title={{\em Index rerum}}]
\placeregister[rerum][columns=2,balance=yes,criterium=all,title={{Index
rerum}}]
\stopsection
\stopchapter
\stopbackmatter
\stoptext
On 30/09/2025 at 21:24, Jean-Pierre Delange via ntg-context wrote:
Hello everyone,
I am trying to automatically build *two indexes* (nominum & rerum)
from two lists (|nomen-list.txt| and |rerum-list.txt|) *without
having to* manually tag the text.
* *ConTeXt LMTX* (ConTeXt Process Management 1.06; mtx-context |
current version: 2025.07.27 21:43)
* OS: Windows
* Compile commands:
|context --purgeall MWE_setregisterentry.tex|
|context MWE_setregisterentry.tex|
Approach:
* I use *|input.processors.add("autoindex", filter)|* +
|\enabledirectives[input.processors=autoindex]| (no
|callbacks.register|).
* In the processor, I search for *literal* occurrences (accents OK;
|'|/|’| and |-|/|–|/|—| tolerated).
* Instead of injecting the classic command |\index[...]|, I call
*|\setregisterentry|* to properly register the entry (key/display).
* The registers are defined in the preamble, |state=stop| then
*|state=start| just after |\starttext|*, and printed with
|\placeregister[...]| in the backmatter.
Problem:
* The MWE *compiles perfectly* (PDF produced, pages OK) but *both
indexes remain empty*.
* I wonder if the processor injects the command during a *preroll*
(or “trial typesetting”), which would cause the entry to be
“lost.” Is there a recommended *flag* (e.g., |\ifprerolling| or
equivalent) to *neutralize* |\setregisterentry| during prerolls
and execute it *only at the right time*?
Questions:
1. Is the approach via |input.processors.add| +
|\enabledirectives[input.processors=autoindex]| the *right* one
for LMTX (rather than the old callbacks)?
2. Is there a *recommended method* for calling |\setregisterentry|
from an input processor in a *reliable* way?
3. Is the signature
|\setregisterentry[reg][keys={...}][entries={...}]| preferable to
the /n/-level form (|keys:1|, |entries:1|) in this use case?
4. Would you recommend *anchoring* detection on *short units* (e.g.,
“Aquin”, “Augustin”) and then forcing the key/display pair for
sorting (a list line such as |Aquin, Thomas|Thomas d'Aquin|, where
the bar separates the sort key from the displayed form)? I am aware
that French diacritical marks, apostrophes, and similar characters
make indexing proper names and nouns harder (see the word
“aujourd'hui” and others).
Thank you in advance for your insights!
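On question 4, here is a minimal stand-alone sketch in plain Lua (no
ConTeXt needed) of how a key|display line could be split and how the
display string could be turned into a pattern that accepts both ' and ’.
The helper names parse_line and to_pattern are hypothetical, and the
byte-level spelling of the apostrophe is an assumption about one
workable way to do it, since Lua character classes match single bytes
rather than UTF-8 characters.

```lua
-- Split a "key|display" list line; a bare term is both key and display.
local function parse_line(line)
  local s = line:gsub("^%s+", ""):gsub("%s+$", "")
  local key, disp = s:match("^(.-)|(.+)$")
  if not key then key, disp = s, s end
  return key, disp
end

-- The curly apostrophe U+2019 is the UTF-8 byte sequence 226 128 153.
-- "' or ’" is therefore spelled as: the byte ' or 226, then optionally
-- 128, then optionally 153.
local APOS = "['" .. string.char(226) .. "]"
          .. string.char(128) .. "?" .. string.char(153) .. "?"

-- Turn a display string into a Lua pattern tolerant of both apostrophes.
local function to_pattern(disp)
  -- escape the characters that are magic in Lua patterns
  local p = disp:gsub("([%%%^%$%*%(%)%+%-%[%]%?%.])", "%%%1")
  -- normalise to the ASCII apostrophe, then widen it in a single pass
  p = p:gsub("’", "'"):gsub("'", APOS)
  return p
end

local key, disp = parse_line("Aquin, Thomas|Thomas d'Aquin")
local pat = to_pattern(disp)

print(key, disp)
print(("Thomas d'Aquin"):find(pat) ~= nil) -- straight apostrophe in the text
print(("Thomas d’Aquin"):find(pat) ~= nil) -- curly apostrophe in the text
```

Both find calls succeed with the same list line, so the text can use
either apostrophe while the sort key stays "Aquin, Thomas".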
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the
Wiki!
maillist :[email protected]
/https://mailman.ntg.nl/mailman3/lists/ntg-context.ntg.nl
webpage :https://www.pragma-ade.nl /https://context.aanhet.net (mirror)
archive :https://github.com/contextgarden/context
wiki :https://wiki.contextgarden.net
___________________________________________________________________________________