https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6444
--- Comment #12 from Bradley Kieser <[email protected]> 2010-06-10 17:03:51 EDT ---
(In reply to comment #7)
> Created an attachment (id=4773)
 --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4773) [details]
> optimized patch
>
> I did some benchmarking on a smallish database (started it few days ago,
> thanks to enhancement in Bug 6447 it only accumulated 5000 tokens so far),
> and the Bradley's patch doesn't fare too well. Turns out it assembles the
> SQL clause for every token, and it unnecessarily updates newest_token_age
> once for each token. Also the sort is probably redundant.
>
> I factored out the invariant operations from Bradley's proposal, which
> resulted in the attached patch - and it became about 10 times faster
> for a message that needed to update 150 tokens. The patch also adds
> tok_get_all and tok_touch_all timing measurements to the timing report.
>
> Probably because of the small set of tokens in my database, the original
> code did even a little bit better than my patched code, although I believe
> that the difference can turn the other way around as reported for a large
> database.
>
> Here are times in milliseconds for a tok_touch_all() which needed
> to update 150 tokens each time (several runs):
>
> original tok_touch_all:
> tok_touch_all: 33
> tok_touch_all: 16
> tok_touch_all: 7
> tok_touch_all: 7
> tok_touch_all: 21
> tok_touch_all: 19
> tok_touch_all: 12
> tok_touch_all: 6
> tok_touch_all: 12
> tok_touch_all: 6
> tok_touch_all: 29
>
> new(Mark) tok_touch_all:
> tok_touch_all: 35
> tok_touch_all: 40
> tok_touch_all: 68
> tok_touch_all: 42
> tok_touch_all: 33
> tok_touch_all: 39
> tok_touch_all: 48
> tok_touch_all: 33
>
> new(Bradley) tok_touch_all:
> tok_touch_all: 413
> tok_touch_all: 330
> tok_touch_all: 253
> tok_touch_all: 525
> tok_touch_all: 579
> tok_touch_all: 248
> tok_touch_all: 329
> tok_touch_all: 753

Please see my comment below.
I forgot, when I submitted the patch, to include the crucial index that is
needed as well:

    create index bayes_token_idx1 on bayes_token(token);
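
To illustrate why a missing index on token could account for timings like the
ones quoted above, here is a minimal sketch using Python's sqlite3 module. It
is an assumption-laden stand-in, not the actual test setup: the table is a
simplified, hypothetical version of bayes_token (only id, token and atime,
with a composite primary key on (id, token)), and SQLite stands in for
whatever database engine the Bayes store really runs on. The helper
token_lookup_plan is likewise made up for this example. It only compares the
query plan for a lookup by token alone, before and after creating the index.

```python
import sqlite3


def token_lookup_plan(conn):
    """Return the planner's description of a lookup that filters on token only."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT atime FROM bayes_token WHERE token = ?",
        (b"abcde",),
    ).fetchall()
    # The human-readable detail text is the last column of each plan row.
    return " ".join(str(row[-1]) for row in rows)


conn = sqlite3.connect(":memory:")

# Simplified, assumed schema: the real bayes_token table has more columns.
conn.execute(
    "CREATE TABLE bayes_token ("
    " id INTEGER NOT NULL,"
    " token BLOB NOT NULL,"
    " atime INTEGER NOT NULL DEFAULT 0,"
    " PRIMARY KEY (id, token))"
)

plan_before = token_lookup_plan(conn)

# The index from the comment above.
conn.execute("CREATE INDEX bayes_token_idx1 ON bayes_token(token)")

plan_after = token_lookup_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

Other engines plan queries differently, so this shows only the general
mechanism: a primary key that leads with id does not by itself help a query
that filters on token alone, whereas bayes_token_idx1 can be used directly.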
