On Thu, Oct 8, 2015 at 8:11 PM, Nathan Wagner <nw...@hydaspes.if.org> wrote:
> On Wed, Oct 07, 2015 at 03:06:50PM -0400, Stephen Frost wrote:
> > * Nathan Wagner (nw...@hydaspes.if.org) wrote:
> > > I have added full text searching to my tracker. I only index the first
> > > 50 KB of each message. There's apparently a one MB limit on that
> > > anyway, which a few messages exceed. I figure anything important is
> > > probably in the first 50KB. I could be wrong. I could re-index fairly
> > > easily. It seems to work pretty well.
>

We have a patch that eliminates the 1MB limit; it will be published soon.

> > Note that we have FTS for the -bugs, and all the other, mailing lists..
>
> True, but that finds emails. The search I have finds bugs (well, bug reports
> anyway). Specifically, I have the following function:
>
> create or replace function bugvector(bugid bigint)
> returns tsvector language 'sql' as $$
>     select tsvagg(
>         setweight(to_tsvector(substr(body(msg), 1, 50*1024)), 'D')
>         ||
>         setweight(to_tsvector(header_value(msg, 'Subject')), 'C')
>     )
>     from emails
>     where bug = $1
> $$ strict;
>
> which, as you can see, collects into one tsvector all the emails associated
> with that particular bug. So a search hit is for the whole bug. There's
> probably some search artifacts here. I suspect a bug with a long email thread
> will be ranked higher than a one with a short thread. Perhaps that's ok
> though.
>

It's possible to write a bug-specific parser for FTS. Also, order the results
by date submitted, so that the originating message always comes first.

> --
> nw
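
A rough sketch of the date-ordering idea, in case it helps; it has not been run against the tracker's schema. The definition of tsvagg was not posted, so the aggregate below (plain concatenation via tsvector_concat) is only an assumption about what it does, and the sent and body column names are made up for the illustration. The ORDER BY inside the aggregate call feeds the messages to the aggregate oldest first, so the originating message's lexemes receive the lowest positions in the combined tsvector.

```sql
-- Assumed definition: concatenate tsvectors, starting from an empty one.
create aggregate tsvagg(tsvector) (
    sfunc    = tsvector_concat,
    stype    = tsvector,
    initcond = ''
);

-- Two hypothetical messages of one bug, listed out of date order on purpose.
select tsvagg(to_tsvector('simple', body) order by sent) as bug_vector
from (values
    (timestamp '2015-10-02', 'follow up reply'),
    (timestamp '2015-10-01', 'original report')
) as m(sent, body);
```

In bugvector() the same ORDER BY clause would go inside the tsvagg(...) call, using whatever column holds the submission date.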