I just tested Paul's latest patch (minus the is_header code updates) and Mikhail's patch with the non-cvs 2.0 package and both coexist nicely. =)
Since you guys seem to want to have is_header used asap to optimize processing time, the following mysql query could help in the update/upgrade process:

    update `dbmail_messageblks` set is_header = 1 where left( messageblk, 9 ) = 'Received:';

I'm assuming all the header blocks start with 'Received:'.

Xing

> Xing,
>
> I have fixed my previously untested patch. Now it works nicely.
>
> [EMAIL PROTECTED] wrote:
>> I had the exact same slowness problem after upgrading to 2.0 via imap last
>> night. I sent a report to the users list before I saw this thread.
>>
>> If the following new query is to be used to optimize the imap:
>>
>> select count(message_idnr), count(message_idnr) - sum(seen_flag),
>> sum(recent_flag) from %smessages where mailbox_idnr = '%llu' and status <
>> '%d';
>>
>> dbmail should have an index, at least for mysql, that indexes both columns
>> (mailbox_idnr and status). I was looking at some of the debug queries
>> and after adding a (mailbox_idnr, status, unique_id) combo index, mysql was
>> able to trim the records it needs to filesort from 21000 to 17000 in
>> the queries which use all 3 of these fields. 4000/21000 = 19%
>> decrease in the records mysql has to manually sort through.
>>
>> Xing
>>
>> Paul J Stevens wrote:
>>
>>> Mikhail Ramendik wrote:
>>>
>>>> Hello,
>>>>
>>>> I have imported a large (about 50,000 messages) folder into dbmail
>>>> - and got really SLOW performance. Opening the folder with an IMAP
>>>> client (Evolution, on the same local machine) takes ages! The
>>>> promised 250 messages/second are just not there.
>>>
>>> Painfully true.
>>>
>>>> I have analyzed the logs and looked at the code. And I can clearly
>>>> see where the bottlenecks are. (This is why I am writing to the
>>>> developers list.)
>>>>
>>>> However, I'm not an expert in database programming.
>>>> So while
>>>> finding the problems was easy, I'd prefer it if someone else could
>>>> fix them :) I have some ideas how to do this; but this would be a
>>>> "last resort".
>>>
>>> Please do share your ideas. Anything that will boost performance will
>>> be seriously considered.
>>>
>>>> So, when the folder gets opened, first the system spends some
>>>> minutes with dbmail-imapd hogging the CPU (a Celeron 2400). And here
>>>> are the log entries between which this happens:
>>>>
>>>> Oct 23 11:42:28 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
>>>> executing query [SELECT message_idnr, seen_flag, recent_flag FROM
>>>> dbmail_messages WHERE mailbox_idnr = '9' AND status < '2' AND
>>>> unique_id != '' ORDER BY message_idnr ASC]
>>>> Oct 23 11:44:19 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
>>>> executing query [SELECT MAX(message_idnr) FROM dbmail_messages WHERE
>>>> unique_id != '']
>>>>
>>>> I have found the corresponding place in the code and can see the
>>>> CPU hog there:
>>>>
>>>> (db.c line 2534)
>>>
>>> Mmm, for me this is around 2340 more like.
>>>
>>>> /* alloc mem */
>>>> mb->seq_list = (u64_t *) my_malloc(sizeof(u64_t) * mb->exists);
>>>> if (!mb->seq_list) {
>>>>         /* out of mem */
>>>>         db_free_result();
>>>>         return -1;
>>>> }
>>>>
>>>> for (i = 0; i < db_num_rows(); i++) {
>>>>         if (db_get_result(i, 1)[0] == '0')
>>>>                 mb->unseen++;
>>>>         if (db_get_result(i, 2)[0] == '1')
>>>>                 mb->recent++;
>>>>
>>>>         mb->seq_list[i] = db_get_result_u64(i, 0);
>>>> }
>>>>
>>>> db_free_result();
>>>>
>>>> Well, with db_num_rows() in the tens of thousands one would expect
>>>> a CPU hog here! Three calls to db_get_result for every message - and
>>>> db_get_result performs seeking every single time, too.
>>>
>>> And only to count the number of recent_flags, and seen_flags for a
>>> certain subset of message_idnrs all those message rows are actually
>>> selected !!
>>> Clearly usage of COUNT() comes to mind as an
>>> optimization.
>>>
>>> Therefore, where in db_getmailbox() the following code is used to
>>> obtain the total number of messages, the number of unseen, and number
>>> of recent messages:
>>>
>>> /* select messages */
>>> snprintf(query, DEF_QUERYSIZE,
>>>          "SELECT message_idnr, seen_flag, recent_flag "
>>>          "FROM %smessages WHERE mailbox_idnr = '%llu' "
>>>          "AND status < '%d' "
>>>          "ORDER BY message_idnr ASC",
>>>          DBPFX, mb->uid, MESSAGE_STATUS_DELETE);
>>>
>>> we would be better off using:
>>>
>>> select count(message_idnr), count(message_idnr) - sum(seen_flag),
>>> sum(recent_flag) from %smessages where mailbox_idnr = '%llu' and status
>>> < '%d';
>>>
>>> and defer fetching the list of message_idnr to a separate query. This
>>> will reduce the number of db_get_result calls in this function by
>>> approx. 66%, which must be good for large folders.
>>>
>>> Thanks for pointing this out. I have this implemented already, if
>>> this works out, I'll commit this change next week.
>>>
>>> [snip a typical message-parsing fetch run]
>>>
>>>> And so it goes on, doing four queries per message! No wonder it can
>>>> only process about 20 messages per second, instead of 250 as
>>>> promised in the README.
>>>
>>> Yep. You hit dbmail's sorest spot called _ic_fetch. Not so easy to
>>> fix. And something I've been working on for some months now. Using a
>>> better mime-parser will give us better model-view separation. We also
>>> need a server-side list-view of messages as presented to the
>>> controller (imapclient). I was working on a separate branch of the
>>> code to investigate some of these issues but have now abandoned the
>>> branch in favor of using smaller atomic change-sets in separate
>>> patches. Debian's dpatch tool really rules here for me, though the
>>> upcoming switch to subversion may well have similar advantages.
>>>
>>>> This actually makes dbmail unusable for me as a local storage - and
>>>> eats away performance in other cases too, though it may not be that
>>>> noticeable.
>>>
>>> Try searching large folders for specific body content.... and wait...
>>> and wait.... Something that makes dbmail currently all but unusable
>>> for client-side filtering.
>>
>> _______________________________________________
>> Dbmail-dev mailing list
>> Dbmail-dev@dbmail.org
>> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>
> --
> ________________________________________________________________
> Paul Stevens [EMAIL PROTECTED]
> NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
> The Netherlands_______________________________________www.nfg.nl
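
For reference, the aggregate query discussed in the thread could be built roughly as below. This is only a sketch under stated assumptions, not the code from Paul's patch: build_mailbox_stats_query is a hypothetical helper, the value of DEF_QUERYSIZE is a guess, MESSAGE_STATUS_DELETE = 2 is inferred from the logged "status < '2'", and the u64_t typedef is assumed. The "dbmail_" prefix in the usage note is taken from the table name in the logs.

```c
#include <assert.h>
#include <stdio.h>

#define DEF_QUERYSIZE 1024        /* assumed value, not taken from dbmail's headers */
#define MESSAGE_STATUS_DELETE 2   /* assumed, inferred from the logged "status < '2'" */

typedef unsigned long long u64_t; /* assumed definition of dbmail's u64_t */

/*
 * Hypothetical helper: formats the single aggregate query that returns
 * (total messages, unseen messages, recent messages) for one mailbox,
 * so the caller reads one row instead of looping over every message row.
 * Returns 0 on success, -1 if the query did not fit into buf.
 */
int build_mailbox_stats_query(char *buf, size_t size,
                              const char *prefix, u64_t mailbox_idnr)
{
    int n;

    assert(buf != NULL && prefix != NULL && size > 0);

    n = snprintf(buf, size,
                 "SELECT COUNT(message_idnr), "
                 "COUNT(message_idnr) - SUM(seen_flag), "
                 "SUM(recent_flag) "
                 "FROM %smessages WHERE mailbox_idnr = '%llu' "
                 "AND status < '%d'",
                 prefix, mailbox_idnr, MESSAGE_STATUS_DELETE);

    /* snprintf reports the length it wanted; treat truncation as an error */
    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

Calling build_mailbox_stats_query(query, DEF_QUERYSIZE, "dbmail_", 9) would produce the counterpart of the slow query from the log for mailbox 9. One caveat for whoever reads the result: in MySQL, SUM() over zero rows yields NULL rather than 0, so an empty mailbox needs the two SUM-based columns treated as 0. The list of message_idnr values would still have to be fetched by a separate query, as Paul suggests.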