Hello, I have some further ideas on _ic_fetch() speedup, both for the time when we have is_header and, as I spent some more time thinking, for the current situation as well.
But I'm afraid I can't code them :( Complicated work in C turned out to be too much for me. I'm only a casual programmer (a tech writer by profession); I have done a fair amount of coding in my life, but never in C. I'll keep my coding to simple things; for example, I'll try to fix the UID SEARCH UID 1:* failure (this prevents communication with Sylpheed). But I can't handle serious FETCH and SEARCH speedups. The one I did was relatively simple; for example, it never touched anything involving allocation and freeing of buffers...

The SEARCH speedup idea (regexp search) was already discussed; FETCH was only mentioned in broad terms. So I decided to describe the FETCH ideas here; I really hope they will be of use.

The main cause of the relative slowness of FETCH is, I think, the fact that it uses many queries. This can also cause system load problems when many people do FETCHing at once. While other things like the header parse loop also take some time, the bulk is spent in the queries. So we need to reduce the number of queries.

An ideal solution would join the entire FETCH into one query. But this would require additions on the database layer, because queries like that can produce *big* result sets. At least for MySQL, we currently use mysql_store_result, which loads the entire result set at once; we'd have to add a way to use mysql_use_result.

But one can also do much without going that far - at least when FETCHing only the headers. This solution would involve loading several sets of headers into a buffer, and then using up the buffered data before the next query.

While we don't have is_header, I would be wary of queries that involve more than 5 or 6 messages. It's easy to query for just the header of *one* message (just add LIMIT 1), but a query for several messages will have to include the bodies, which can sometimes be up to several megabytes each - and we still use mysql_store_result. A large query would result in a RAM hog every time we hit a series of big messages.
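A rough sketch of the batching arithmetic, in plain C and not tested against dbmail. The helper name batch_bounds is made up for illustration and is not an existing dbmail function. It only works out which range of message ids each query would cover, assuming the FETCH range is split into chunks of at most `batch` ids; the ids need not be dense, so a chunk may hold fewer messages than `batch`.

```c
#include <assert.h>

/* Hypothetical helper, not part of dbmail: splits the inclusive id range
 * [first, last] into chunks of at most `batch` ids. Stores the bounds of
 * chunk number `n` (counting from 0) in *lo and *hi and returns 1, or
 * returns 0 when chunk `n` lies past the end of the range. */
static int batch_bounds(unsigned long long first, unsigned long long last,
                        unsigned long long batch, unsigned long long n,
                        unsigned long long *lo, unsigned long long *hi)
{
    assert(batch > 0);
    if (first > last)
        return 0;

    unsigned long long span = last - first;   /* number of ids minus one */
    if (n > span / batch)
        return 0;                             /* no such chunk */

    *lo = first + n * batch;
    /* The last chunk may be shorter than `batch`; clamp it to `last`. */
    *hi = (span - n * batch < batch - 1) ? last : *lo + batch - 1;
    return 1;
}
```

With a batch of 5, the id range 1..12 comes out as 1..5, 6..10 and 11..12, so three queries instead of twelve.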
But querying for 5 messages and storing the headers would work even without is_header. This is the query I'm thinking of (line wrapped for clarity; warning: not tested):

    SELECT msg.message_idnr, blk.messageblk_idnr, blk.messageblk
    FROM dbmail_messages msg, dbmail_messageblks blk
    WHERE msg.physmessage_id = blk.physmessage_id
    AND msg.message_idnr BETWEEN '%llu' AND '%llu'
    AND msg.mailbox_idnr='%llu'
    ORDER BY msg.message_idnr ASC, blk.messageblk_idnr ASC;

Then we loop through the result set, use only the first messageblk for each message_idnr, parse the headers, and send them to the client.

This can be implemented in the current code, and I suspect it can speed things up (and make them scale much better) even at this stage.

When we finally have is_header, we can query for headers only. Then we can query for, and buffer, something like 50 or 100 headers at one time, with no changes to the database layer in the code. A header will not take more than several kilobytes of RAM, so both the querying and the buffering become easy on RAM.

These ideas are for the current code base. I just won't be able to handle those buffers the right way, tracking each pointer to a free() in all the code; so I won't be able to code them. I hope someone here is better at C than me :)

Yours,
Mikhail Ramendik
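P.S. To make the "first messageblk per message" loop a bit clearer, here is a rough sketch in plain C. It is not dbmail code and not tested against it: struct blk_row and first_blocks are made-up names, and the array of rows merely stands in for the MySQL result set. It assumes the rows arrive sorted by message_idnr and then messageblk_idnr, as the ORDER BY in the query asks for. It stores pointers only, so it does no allocation or freeing at all; real code would still have to copy or parse each header before the result set is freed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one row of the result set. */
struct blk_row {
    unsigned long long message_idnr;
    unsigned long long messageblk_idnr;
    const char *messageblk;
};

/* Walks rows sorted by (message_idnr, messageblk_idnr) and stores a pointer
 * to the first block of each message in out[]. Stops after max_out entries.
 * Returns the number of pointers stored. */
static size_t first_blocks(const struct blk_row *rows, size_t nrows,
                           const char **out, size_t max_out)
{
    size_t stored = 0;

    assert(rows != NULL || nrows == 0);
    for (size_t i = 0; i < nrows && stored < max_out; i++) {
        /* A change of message_idnr marks the first block of a new message;
         * later blocks of the same message (the body) are skipped. */
        if (i == 0 || rows[i].message_idnr != rows[i - 1].message_idnr)
            out[stored++] = rows[i].messageblk;
    }
    return stored;
}
```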
