Reading through this thread, my instinct is that we're overcomplicating
and over-specifying things. It's quite possible that each server will
want to have its own search syntax, and I don't think we need to
expose any specific syntax (e.g. a simple search vs. an advanced search)
in MAM. Instead, maybe it would be better to let the server send a URL
that can be displayed next to the search box, linking to a help page
that describes the syntax it uses?
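
A minimal client-side sketch of that idea, assuming a hypothetical
`help_url` value that the server would supply alongside the search field
(neither that name nor the function below comes from XEP-0431 or
XEP-0004; they are for illustration only):

```python
from typing import Optional
from urllib.parse import urlsplit


def search_help_link(help_url: Optional[str]) -> Optional[str]:
    """Return the URL to show next to the search box, or None.

    `help_url` is a hypothetical, server-supplied value. Only http(s)
    links with a host are accepted; anything else is ignored so the
    client never renders an unexpected link scheme.
    """
    if not help_url:
        return None
    try:
        parts = urlsplit(help_url)
    except ValueError:
        # Malformed URL (e.g. a broken IPv6 literal): show no link.
        return None
    if parts.scheme in ("http", "https") and parts.netloc:
        return help_url
    return None
```

In this sketch the client sends the user's query string to the server
unchanged and never needs to know which syntax is in use; the link is
the only syntax-related thing it displays.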

On Fri, Jun 3, 2022, at 05:50, Matthew Wild wrote:
> Hi folks,
>
> Thanks to Guus's persistence, I finally took time to close a few
> issues I have with the current XEP-0431 (Full Text Search in MAM).
>
> The main issue is that the current version of the spec provides no
> guarantees about how the search string (generally input from a
> user) will be interpreted. Usually in such cases, I would say this
> is fine... an implementation that returns all messages containing
> "bar" when you submit a search for "foo" is obviously broken and
> nobody would want to use it, even if it's 100% permitted behaviour
> by the XEP.
>
> But full-text search is actually a complex topic, and there are
> various backend implementations that servers are likely to lean on.
> Each of them has a different search syntax, and there is no way (in an
> open ecosystem) for a user to know which of these may be used.
>
> My proposal does two things to fix this situation:
>
>   1) Add a "simple" search type, which is recommended to be
>      implemented as a baseline for interoperability. For simple
>      searches, the server promises that no search terms or symbols
>      will be interpreted as special syntax - what you search is what
>      you get.
>
>   2) Extend the existing ("advanced") search field with a
>      recommendation that the server includes a <desc> element (already
>      defined in XEP-0004) to explain the supported syntax to the user,
>      and an (entirely optional) machine-readable hint that can be used
>      to indicate to a client that a commonly-used syntax is supported.
>
> Finally, most full-text search engines are not language-agnostic. This
> is because they perform operations such as stemming, and utilize a
> "stop word" list while building the index to help improve the search
> results. Many default to English, and while searches in other
> languages generally work, they may be silently worse. I've added an
> optional tag through which the server can indicate the natural
> languages that the search is optimized for. I feel least strongly
> about this addition, since this information is usually going to be
> apparent to the user already based on the context.
>
> Commit:
> https://github.com/xsf/xeps/compare/master...mwild1:xep-0431-v0.3.0?expand=1
> Rendered: https://matthewwild.co.uk/uploads/xeps/xep-0431.html
>
> Feedback welcome, including from Dave (document's author) who I
> haven't consulted about these changes. If there are no objections,
> I'll raise a PR soon.
>
> Regards, Matthew
> _______________________________________________
> Standards mailing list
> Info: https://mail.jabber.org/mailman/listinfo/standards
> Unsubscribe: Standards-unsubscr...@xmpp.org
> _______________________________________________

-- 
Sam Whited
