Re: [Dbmail-dev] search to optimize _ic_search

Micah Thu, 10 Jun 2004 02:08:49 +0200 (CEST)

Is the header block always the first block inserted? If so you can find it by 
ordering the blocks by idnr and just taking the first one. No extra field 
necessary..


I don't think there would be a performance difference at all between the two 
methods and it would keep the table structure simpler, which to me is better. 

Just a thought.

-Micah 


On Wednesday 09 June 2004 05:00 pm, Aaron Stone wrote:
> Off the top of my head, there's a two step process here. First is that the
> top level, messgae-oriented functions will need an extra flag. Second is
> that the inner physmessage functions will need that flag, too.
>
> db_insert_message_block() and db_insert_message_block_physmessage().
>
> I don't think we should wait on 2.0 for this. We can tweak performance in
> the point releases. We should add whatever columns we'll need to support
> the functionality, though.
>
> Aaron
>
> Paul J Stevens <[EMAIL PROTECTED]> said:
> > Hi Ilja,
> >
> > The wiki looks pretty much empty so I just started playing around.
> >
> > I've been testing with a messageblks.is_header field so db_fetch_headers
> >   will not retrieve the whole message before parsing.
> >
> > Looks like we may get something like a factor 2 improvement in searching
> > headers.
> >
> > I still have to figure out how to modify message insertion to use such a
> > field. Perhaps you or Aaron have some bright ideas.
> >
> > For now I've modified the database by hand using group by queries to
> > locate messageblk_idnr for the header block and a shell-script to
> > generate update queries.
> >
> > queries used:
> >
> > alter table messageblks add is_header int(1) not null default 0;
> >
> > update the messageblk table:
> >
> > #> mysql --skip-column-names -B -e "select messageblk_idnr from
> > messageblks group by physmessage_id" dbmail |awk 'BEGIN {
> > printf("\nupdate messageblks set is_header=1 where messageblk_idnr in
> > ("); }   { if(NR % 200 == 0) { printf("\nupdate messageblks set
> > is_header=1 where messageblk_idnr in ("); i=0; } else {
> > printf("%s,",$1); }}'|sed 's/,$/);/' | mysql dbmail
> >
> >
> > #> mysql -B -e "select messageblk_idnr from messageblks group by
> > physmessage_id" dbmail |awk '{ if(NR % 50 == 0) { printf("\nupdate
> > messageblks set is_header=1 where messageblk_idnr in ("); i=0; } else {
> > printf("%s,",$1); }}'|sed 's/,$/);/'
> >
> > Ilja Booij wrote:
> > > Paul J Stevens wrote:
> > >> Hi all,
> > >>
> > >> I'm looking to optimize _ic_search without resorting to drastics.
> > >>
> > >> I noticed dbmsgbuf.c retrieves the full message from the database in
> > >> two functions. In both cases first the physmessage_id is queried using
> > >> the msgid, followed by a query on the messageblks table using the
> > >> physmessage_id. Two queries where one suffices. A simple but effective
> > >> improvement I think. Query syntax (left join ... using ...) checked on
> > >> mysql/postgres (dunno about oracle or ansi-sql).
> > >>
> > >> I also fixed some of the FIXME's ...
> > >>
> > >> Ilja?
> > >
> > > Looks good. I'v committed the patch.
> > >
> > >> PS. Of course searching still sucks major :-( until we get some real
> > >> header caching...
> > >
> > > true..
> > >
> > > Ilja
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > [email protected]
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >
> > --
> >    ________________________________________________________________
> >    Paul Stevens                                  mailto:[EMAIL PROTECTED]
> >    NET FACILITIES GROUP                     PGP: finger [EMAIL PROTECTED]
> >    The Netherlands________________________________http://www.nfg.nl

Re: [Dbmail-dev] search to optimize _ic_search

Reply via email to