Reading through this thread, my instinct is that we're overcomplicating and over-specifying things. It's quite possible that each server will want its own search syntax, and I don't think we need to expose any specific syntax (e.g. a simple search vs. an advanced search) in MAM. Instead, maybe it would be better to let the server send a URL, displayed next to the search box, that links to a help page describing the syntax it uses?
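As a rough illustration of the help-link idea, here is a minimal client-side sketch in Python. Everything in it is hypothetical: the SearchField type, the help_url attribute, and both helper functions are invented for this example and are not defined by XEP-0431 or any existing library. It assumes only that the server somehow hands the client an optional URL.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class SearchField:
    """Hypothetical client-side view of a server's full-text search field."""
    label: str
    # Assumption: the server may supply a link to a page describing its syntax.
    help_url: Optional[str] = None


def safe_help_url(url: Optional[str]) -> Optional[str]:
    """Return the URL only if it is an absolute http(s) link, else None."""
    if not url:
        return None
    parsed = urlparse(url)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return url
    return None


def render_search_box(field: SearchField) -> str:
    """Build the text a client might show next to its search box."""
    link = safe_help_url(field.help_url)
    if link is None:
        # No usable link: show the plain search box.
        return f"[{field.label}]"
    return f"[{field.label}] (search syntax help: {link})"
```

In this sketch the client never parses or validates the query itself. It passes the user's text to the server unchanged and only decides whether to show the link, which is why anything other than an absolute http(s) URL is ignored.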

On Fri, Jun 3, 2022, at 05:50, Matthew Wild wrote:
> Hi folks,
>
> Thanks to Guus's persistence, I finally took time to close a few
> issues I have with the current XEP-0431 (Full Text Search in MAM).
>
> The main issue is that the current version of the spec provides no
> guarantees about how the search string (generally input from a
> user) will be interpreted. Usually in such cases, I would say this
> is fine... an implementation that returns all messages containing
> "bar" when you submit a search for "foo" is obviously broken and
> nobody would want to use it, even if it's 100% permitted behaviour
> by the XEP.
>
> But full-text search is actually a complex topic, and there are
> various backend implementations that servers are likely to lean on.
> Each of them has a different search syntax, and there is no way (in an
> open ecosystem) for a user to know which of these may be used.
>
> My proposal does two things to fix this situation:
>
> 1) Add a "simple" search type, which is recommended to be
> implemented as a baseline for interoperability. For simple
> searches, the server promises that no search terms or symbols
> will be interpreted as special syntax - what you search is what
> you get.
>
> 2) Extend the existing ("advanced") search field with a
> recommendation that the server includes a <desc> element (already
> defined in XEP-0004) to explain the supported syntax to the user,
> and an (entirely optional) machine-readable hint that can be used
> to indicate to a client that a commonly-used syntax is supported.
>
> Finally, most full-text search engines are not language-agnostic. This
> is because they perform operations such as stemming, and utilize a
> "stop word" list while building the index to help improve the search
> results. Many default to English, and while searches in other
> languages generally work, they may be silently worse. I've added an
> optional tag through which the server can indicate the natural
> languages that the search is optimized for. I feel least strongly
> about this addition, since this information is usually going to be
> apparent to the user already based on the context.
>
> Commit:
> https://github.com/xsf/xeps/compare/master...mwild1:xep-0431-v0.3.0?expand=1
> Rendered: https://matthewwild.co.uk/uploads/xeps/xep-0431.html
>
> Feedback welcome, including from Dave (document's author) who I
> haven't consulted about these changes. If there are no objections,
> I'll raise a PR soon.
>
> Regards, Matthew
> _______________________________________________
> Standards mailing list
> Info: https://mail.jabber.org/mailman/listinfo/standards
> Unsubscribe: Standards-unsubscr...@xmpp.org
> _______________________________________________

--
Sam Whited