Hi Carsten,

Thanks for the explanation. It only confirms my opinion that this
isn't a topic for a policy document.

Regards/Ngā mihi
   Brian

On 28-Oct-25 10:34, Carsten Bormann wrote:

Anyone who has recently used well-known search tools, especially those
"enhanced" by AI, will have a good laugh at "searches ... should
return accurate results". I really don't see much point in the whole
sentence, since it appears to set requirements for all search engines.

Let me try to explain the background here.

There are various ways in which XML representation (and possibly the resulting 
rendering) can get in the way of classical search tools (such as grep or 
searches in a programmer’s editor).

We sometimes have to use Unicode characters to control distracting line 
breaking, such as changing a hyphen-minus into a non-breaking hyphen (U+2011), 
or inserting word joiners and similar characters (*).
That can get in the way of searches that assume the original hyphen or the 
absence of word joiners.
The plain text renderer of xml2rfc often can remove (or revert) these helping 
changes during its formatting stage, so the .txt is clean.
The HTML renderer can’t, because the browser owns the text formatter we use 
and needs these characters to do the line breaking.
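
To make the search problem concrete, here is a minimal Python sketch. The 
helper name normalize_for_search and the sample string are hypothetical, made 
up for illustration; this is not code from xml2rfc or kramdown-rfc. It only 
shows that a plain substring search misses text containing a word joiner 
(U+2060) or a non-breaking hyphen (U+2011), and that undoing those two 
substitutions makes the search work again.

```python
# Hypothetical illustration; not taken from xml2rfc or kramdown-rfc.
WORD_JOINER = "\u2060"          # invisible, prevents a line break
NON_BREAKING_HYPHEN = "\u2011"  # looks like "-", prevents a line break
HYPHEN_MINUS = "-"


def normalize_for_search(text: str) -> str:
    """Undo the two line-breaking helpers so plain searches match again."""
    return text.replace(WORD_JOINER, "").replace(NON_BREAKING_HYPHEN, HYPHEN_MINUS)


# Sample text as it might appear in rendered HTML (assumed, for illustration).
html_text = "S/\u2060MIME uses a non\u2011breaking hyphen"

# A classical search for the "obvious" spelling finds nothing.
print("S/MIME" in html_text)
print("non-breaking" in html_text)

# After normalization, the same searches succeed.
cleaned = normalize_for_search(html_text)
print("S/MIME" in cleaned)
print("non-breaking" in cleaned)
```

A search tool would have to apply this kind of normalization itself, and 
these two characters are only examples of what might need handling.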

Of course, we could add a <nobr> span element to avoid having to do this 
specific case via Unicode characters, but I think this is a more general problem.

(There are also SVG tools that can make searches for the text in the diagrams 
harder, but I think these are not a 7997bis matter.)

Grüße, Carsten

(*) For example, S/MIME gets broken after the S/ in xml2rfc unless you insert a 
word joiner.
(Of course, kramdown-rfc can do that for you automatically in case you are 
writing a longer document about S/MIME — so you can still search the markdown 
:-).)

--
rswg mailing list -- [email protected]