On Sun, Nov 6, 2016 at 9:27 AM Daniel Gruno <[email protected]> wrote:

> On 11/06/2016 03:18 PM, sebb wrote:
> > Fields such as message-id are stored as text strings, but they are
> > only really intended to be used as ids. They don't contain independent
> > text parts.
> >
> > From what I have understood so far from reading the ES docs, such
> > fields should be tagged as
> >
> > "index": "not_analyzed"
> >
> > AIUI this reduces the analysis overhead and storage requirements, and
> > also makes it harder to find fields with
> > This probably applies to other fields in "mbox":
> >
> > mid
> > possibly in-reply-to
> > also references
> >
> > And of course the auto-created fields such as attachments
> >
> > Likewise the doc types currently missing from setup.py:
> >
> > notifications
> > account
> > mailinglists
> >
> > These are internal use only so are not intended for searching.
> >
> > Or have I got this completely wrong?
> >
>
> message-id is set to not be analyzed, by the setup script (it's in the
> mappings it sends to ES when creating the index). mid and in-reply-to
> should probably also be not analyzed, although mid is really a copy of
> the doc ID, IIRC. The list ID is also not analyzed by default (as
> list_raw), and neither is the raw from address.
>

So I notice the query process is an arbitrary full text query, which runs
against _all.
https://github.com/apache/incubator-ponymail/blob/master/site/api/lib/elastic.lua#L44
That is, unless something else builds the query up a bit differently; I
need to dig into it a bit further to check.

So... that means most of these mappings are moot.
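
To make the distinction concrete, here is a rough sketch in Python of the
two pieces being discussed: a mapping that marks the ID-like fields as
not_analyzed, and the difference between a free-text query (which falls
back to _all when no field is named) and an exact-match query against one
field. The helper names are made up for illustration, the field list is
just the one from this thread, and the mapping syntax assumes the ES
1.x/2.x "string" type; it is not copied from setup.py or elastic.lua.

```python
import json

# Field names taken from this thread; whether all of them should really be
# not_analyzed is still an open question.
ID_FIELDS = ["message-id", "mid", "in-reply-to", "references"]


def id_field_mappings(fields):
    """Build ES 1.x/2.x-style properties that store each field as a single
    unanalyzed term (hypothetical helper, not part of Pony Mail)."""
    return {name: {"type": "string", "index": "not_analyzed"} for name in fields}


def full_text_query(text):
    """A query_string query with no default_field. ES then searches the
    index default field, which is _all unless configured otherwise."""
    return {"query": {"query_string": {"query": text}}}


def exact_id_query(field, value):
    """A term query against one named field. This is the kind of lookup
    that a not_analyzed mapping is meant to serve."""
    return {"query": {"term": {field: value}}}


if __name__ == "__main__":
    mapping = {"mappings": {"mbox": {"properties": id_field_mappings(ID_FIELDS)}}}
    print(json.dumps(mapping, indent=2))
    print(json.dumps(full_text_query("some search words")))
    print(json.dumps(exact_id_query("message-id", "<[email protected]>")))
```

If the API only ever sends the first kind of query, the per-field index
setting never comes into play at search time, since _all is analyzed on
its own. The not_analyzed setting only pays off for lookups shaped like
the second kind.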
