On Sun, Nov 6, 2016 at 8:22 PM sebb <[email protected]> wrote:

> On 6 November 2016 at 14:37, John D. Ament <[email protected]> wrote:
> > On Sun, Nov 6, 2016 at 9:27 AM Daniel Gruno <[email protected]> wrote:
> >
> >> On 11/06/2016 03:18 PM, sebb wrote:
> >> > Fields such as message-id are stored as text strings, but they are
> >> > only really intended to be used as ids. They don't contain independent
> >> > text parts.
> >> >
> >> > From what I have understood so far from reading the ES docs, such
> >> > fields should be tagged as
> >> >
> >> > "index": "not_analyzed"
> >> >
> >> > AIUI this reduces the analysis overhead and storage requirements, and
> >> > also makes it harder to find fields with
> >> > This probably applies to other fields in "mbox":
> >> >
> >> > mid
> >> > possibly in-reply-to
> >> > also references
> >> >
> >> > And of course the auto-created fields such as attachments
> >> >
> >> > Likewise the doc types currently missing from setup.py:
> >> >
> >> > notifications
> >> > account
> >> > mailinglists
> >> >
> >> > These are internal use only so are not intended for searching.
> >> >
> >> > Or have I got this completely wrong?
> >> >
> >>
> >> message-id is set to not be analyzed, by the setup script (it's in the
> >> mappings it sends to ES when creating the index). mid and in-reply-to
> >> should probably also be not analyzed, although mid is really a copy of
> >> the doc ID, IIRC. the list ID is also not analyzed by default (as
> >> list_raw), neither is the raw from address
> >>
> >
> > So I notice the query process is an arbitrary full text query, which runs
> > against _all.
> >
> > https://github.com/apache/incubator-ponymail/blob/master/site/api/lib/elastic.lua#L44
>
> Huh?
>
> The query starts:
>
> local url = config.es_url .. doc .. "/_search?q="..query
>
> where
>
> es_url = "http://localhost:9200/ponymail/"
>
> and
>
> doc = "mbox" by default.
>
> Where does the _all come in?
>

When you do a query string query in Elasticsearch (reference:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html),
the default field, unless one is specified, is "_all". I can't find anything
in the Pony Mail code that changes this field. As a result, it's going to
search _all by default.

>
> > unless
> > I need to dig into it a bit further to see if there's something building up
> > query a bit different.
> >
> > So... that means most of these mappings are moot.
>
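
To make the _all point concrete, here is a rough sketch in Python (not the actual Lua from elastic.lua) of the URL that gets built. The es_url and doc values are the ones quoted above; the search_url helper is made up for illustration. It also URL-encodes the query, which the quoted Lua line does not do. As far as I understand the URI search docs, "df" sets the default field for the q= query string, and a "field:value" prefix inside the query targets a single field.

```python
from urllib.parse import urlencode

# Values quoted earlier in this thread.
ES_URL = "http://localhost:9200/ponymail/"
DOC = "mbox"


def search_url(query, default_field=None):
    """Build a URI search URL (hypothetical helper, for illustration only).

    With no default_field, Elasticsearch falls back to its own default
    field for the query string, which is "_all" unless configured otherwise.
    """
    params = {"q": query}
    if default_field is not None:
        # "df" is the URI search parameter for the default field.
        params["df"] = default_field
    return ES_URL + DOC + "/_search?" + urlencode(params)


# No field given: the terms are matched against the default field (_all).
print(search_url("foo"))

# Field prefix inside the query string: only message-id is searched.
print(search_url("message-id:abc"))

# Same idea, but via the df parameter.
print(search_url("abc", default_field="message-id"))
```

If the real code ever adds a field prefix or a df parameter before sending the request, the not_analyzed mappings would matter again for those fields.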
