On Mon, Nov 06 2006, Katsumi Yamaoka wrote:

>>>>>> In <[EMAIL PROTECTED]> Richard Stallman wrote:
>
>> Scoring of the messages closer to the beginning of the buffer is fast,
>> but as we move to higher-numbered messages, that are closer to the end
>> of such big files/buffers, gnus will only score 2-3 messages per
>> minute, and that's what kills performance.
[...]
> (setq gnus-article-button-face nil
>       gnus-signature-face nil
>       gnus-summary-selected-face nil
>       gnus-treat-highlight-citation nil
>       gnus-treat-emphasize nil)
>
> If it makes Gnus fast, improving the performance will be worth
> trying. However, I didn't feel any difference, though it might
> be because I don't have huge mail folders.

I don't think this matches the problem description. When scanning big
mbox files, article display isn't involved. Or am I missing
something?

My guess is that it's a problem with case-fold-search when searching
for "X-Gnus-Article-Number" in mbox files in Emacs 22, as analyzed by
Elias Oltmanns back in June:

,----[ http://thread.gmane.org/gmane.emacs.devel/53901/focus=54013 ]
| From: Elias Oltmanns <oltmanns <at> uni-bonn.de>
| Subject: Re: New buffer-case-table makes search_buffer painfully slow
| Newsgroups: gmane.emacs.devel
| Date: 2006-05-06 19:10:08 GMT
|
| Elias Oltmanns <oltmanns <at> uni-bonn.de> wrote:
| > Hi all,
| >
| > switching from emacs 21 to emacs 22 has a very significant performance
| > impact on packages that make heavy use of search_buffer. An example
| > that actually made me aware of this problem is gnus processing large
| > mbox files. Further analysis of this problem revealed that in emacs 22
| > an "i" in the search string makes search_buffer use simple_search()
| > instead of boyer_moore().
|
| Emacs 22's EQUIVALENCES table relates i, and thus I as well, to two
| more characters with character codes 331857 and 331856. On
| www.unicode.org the character look up engine couldn't find a match for
| U+51051 or U+51050 saying that most likely those codes weren't
| assigned to any characters yet.
|
| So, here is a plain question: Is there a bug in the case-table in
| emacs 22 or does the search engine on www.unicode.org for some reason
| miss certain character ranges? Slightly biassed, I'm disregarding the
| possibility of me being unable to use www.unicode.org properly, which,
| in fact, might well be the reason for my confusion.
|
| Second question: If the case-table was right, what would be the right
| way to tackle the problem described in my original post?
| For me the following snippet in .emacs solves the problem:
| --- ~/.emacs ---
| (unless (< emacs-major-version 22)
|   (set-case-syntax 331856 "w" (standard-case-table))
|   (set-case-syntax 331857 "w" (standard-case-table)))
| --- ~/.emacs ---
|
| This, of course, is a dirty hack and I'm wondering whether emacs
| should provide a feature to "clean up" the EQUIVALENCES table in the
| ascii range in order to avoid falling back to a slow search
| algorithm when we are searching for pure ascii strings. Or do you
| think that packages like gnus which make heavy use of
| re-search-forward should handle these performance issues
| themselves---or indeed the users.
`----

Alexandre, could you please check whether the hack suggested by Elias
makes your problem go away?

Richard proposed a fix for this, but AFAICS, this has not been
implemented:

,----[ http://thread.gmane.org/gmane.emacs.devel/53901/focus=54025 ]
| From: Richard Stallman <rms <at> gnu.org>
| Subject: Re: New buffer-case-table makes search_buffer painfully slow
| Newsgroups: gmane.emacs.devel
| Date: 2006-05-07 05:01:27 GMT
|
| I think this has to do with the special characters for Turkish,
| lower-case i without dot and upper-case I with dot. In Turkish,
| upcasing and downcasing preserve the dot, or the absence of the dot.
|
| I think these lines in characters.el are the cause of the problem.
|
|   (set-downcase-syntax ?İ ?i tbl)
|   (set-upcase-syntax ?I ?ı tbl)
|
| They set up only half of what Turkish needs.
| They make dotless-i upcase into I, and they make
| I-with-dot downcase into i. They can't do vice versa
| because that would break things for other languages.
| So they are not really useful. We could simply delete them.
|
| We could also add a minor mode to set up the case table all the way
| for Turkish.
|
| Would someone like to do that?
`----

Looking at the ChangeLog, it seems that the relevant code in
`characters.el' ...

,----[ international/characters.el ]
| ;; In some languages, U+0049 LATIN CAPITAL LETTER I and U+0131 LATIN
| ;; SMALL LETTER DOTLESS I make a case pair, and so do U+0130 LATIN
| ;; CAPITAL LETTER I WITH DOT ABOVE and U+0069 LATIN SMALL LETTER I.
| ;; Thus we have to check language-environment to handle casing
| ;; correctly. Currently only I<->i is available.
| [...]
|   (set-downcase-syntax ?İ ?i tbl)
|   (set-upcase-syntax ?I ?ı tbl)
`----

... has been changed back and forth several times:

,----[ ChangeLog ]
| 2005-04-01  Kenichi Handa  <[EMAIL PROTECTED]>
|
|   * international/characters.el: Enable the correct case setting for
|   dotless-i and dotted-I.
|
| 2005-02-02  Kenichi Handa  <[EMAIL PROTECTED]>
|
|   * international/characters.el: Cancel previous change for
|   I-WITH-DOT-ABOVE and DOTLESS-i.
|
| 2005-02-02  Kenichi Handa  <[EMAIL PROTECTED]>
|
|   * international/latin-5.el (tbl): Setup cases of I-WITH-DOT-ABOVE,
|   DOTLESS-i.
|
|   * international/characters.el: Setup cases of GREEK-FINAL-SIGMA,
|   Y-WITH-DIAERESIS, I-WITH-DOT-ABOVE, DOTLESS-i.
`----

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/
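
P.S. Before touching the case table, it might be worth checking whether
case folding is really what slows the search down. Below is a rough
sketch for that, not anything that exists in Gnus or Emacs: the
function name `my-time-header-search' is made up for illustration, and
I'm assuming that binding `case-fold-search' to nil is enough to avoid
the slow path Elias describes. It only times a plain `search-forward'
for the header over the current buffer, once with folding and once
without.

```elisp
;; Hypothetical helper for diagnosis only; not part of Gnus or Emacs.
(require 'benchmark)

(defun my-time-header-search (fold)
  "Return the seconds needed to find every Gnus article-number header.
The search runs over the whole current buffer.  FOLD is the value
bound to `case-fold-search' while searching."
  (let ((case-fold-search fold))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
    (car (benchmark-run 1
           (save-excursion
             (goto-char (point-min))
             ;; Empty loop body: we only care how long the searching takes.
             (while (search-forward "X-Gnus-Article-Number" nil t)))))))
```

To use it, visit one of the big mbox files and evaluate
(list (my-time-header-search t) (my-time-header-search nil)) with M-:.
If the first number is much larger than the second, the case table is a
likely culprit and Elias's hack should help; if both are about the
same, the slowdown probably comes from somewhere else.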