--On Monday, October 27, 2025 22:34 +0100 Carsten Bormann
<[email protected]> wrote:

> 
>>> Anyone who has recently used well known search tools, especially
>>> those "enhanced" by AI, will have a good laugh at "searches ...
>>> should return accurate results". I really don't see much point in
>>> the whole sentence, since it appears to set requirements for all
>>> search engines.
> 
> Let me try to explain the background here.
> 
> There are various ways in which XML representation (and possibly
> the resulting rendering) can get in the way of classical search
> tools (such as grep or searches in a programmer's editor).
> 
> We sometimes have to use Unicode characters to control distracting
> line breaking, such as changing a hyphen-minus into a non-breaking
> hyphen (U+2011), or inserting word joiners and similar characters
> (*). That can get in the way of searches that assume the original
> hyphen or the absence of word joiners. The plain text renderer of
> xml2rfc often can remove (or revert) these helping changes during
> its formatting stage, so the .txt is clean. The HTML renderer
> can't, because the browser needs them to do line-breaking as it
> owns the text formatter we use.
> 
> Of course, we could add a <nobr> span element to avoid having to do
> this specific case via Unicode characters, but I think this is a
> more general problem.
> 
> (There also are svg tools that can make searches for the text in
> the diagrams harder, but I think these are not 7997bis matter.)
> 
> Grüße, Carsten
> 
> (*) For example, S/MIME gets broken after the S/ in xml2rfc unless
> you insert a word joiner.   (Of course, kramdown-rfc can do that
> for you automatically in case you are writing a longer document
> about S/MIME — so you can still search the markdown :-).)

And, of course, several special characters like those you have
mentioned (word joiners, etc., to say nothing of directionality
affecters, if those are similarly needed) are not "displayable
characters" in the usual sense of that term.  So this discussion
affects more than just search.
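
To make the search side of this concrete, here is a minimal sketch of
how a tool might fold such helper characters away before comparing
text.  The mapping table and the function names are assumptions made
up for illustration; nothing here is taken from xml2rfc or
kramdown-rfc, and a real tool would probably need a longer, more
carefully chosen list.

```python
# Hypothetical folding table: code point -> replacement (None deletes).
# The choice of characters is an assumption for illustration only.
SEARCH_FOLD = {
    0x2011: "-",   # NON-BREAKING HYPHEN -> HYPHEN-MINUS
    0x2060: None,  # WORD JOINER -> removed
    0x200B: None,  # ZERO WIDTH SPACE -> removed
    0x00A0: " ",   # NO-BREAK SPACE -> SPACE
}


def fold_for_search(text: str) -> str:
    """Return text with the line-breaking helper characters folded away."""
    return text.translate(SEARCH_FOLD)


def find_folded(needle: str, haystack: str) -> bool:
    """Substring search that ignores the helper characters on both sides."""
    return fold_for_search(needle) in fold_for_search(haystack)


if __name__ == "__main__":
    rendered = "a longer document about S/\u2060MIME and non\u2011breaking hyphens"
    print("S/MIME" in rendered)              # plain search on the raw text
    print(find_folded("S/MIME", rendered))   # search after folding
```

A plain grep has no such folding step, which is why the raw search
above misses the string while the folded one finds it.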

   john

-- 
rswg mailing list -- [email protected]
To unsubscribe send an email to [email protected]
