Ori.livneh has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/49985


Change subject: (Bug 10621) Don't highlight symbols if >10k exist
......................................................................

(Bug 10621) Don't highlight symbols if >10k exist

Enabling full syntax highlighting for very long Lua modules can produce DOMs
that have hundreds of thousands of elements and cause browsers to lock up.

I took a count of spans by class (which amounts to a count of tokens by type)
of https://en.wiktionary.org/wiki/Module:languages and came up with:

       sy0:      62545 (symbols)
       br0:      61952 (brackets)
       st0:      39291 (strings)
       kw3:       7746 (keywords)
       kw1:          3
       kw2:          2
       co2:          2
       co1:          2
       nu0:          1
     ------     ------
     Total:     171544

GeSHi allows you to disable highlighting for a particular token type (see
<http://qbnz.com/highlighter/geshi-doc.html#disabling-lexics>), which seems
like a good way of handling this issue.

Disabling symbols (set_symbols_highlighting(false)) removes both sy0 and br0
elements from the DOM (about 124k elements in the case of Module:languages),
with about 47k elements remaining on the page. This is enough to make Chromium
responsive on my laptop (2.3 GHz i5, 8 GB RAM), but it's still noticeably
sluggish. Adding 'set_strings_highlighting(false);' removes another 40k
elements from the rendered output, and the resulting DOM is quite zippy at 8k
elements.
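
As a rough check of the arithmetic, the element counts can be recomputed from
the per-class span counts listed earlier. This is only an illustrative sketch;
the variable names are made up for this example and are not part of the change.

```php
<?php
// Span counts per GeSHi CSS class, copied from the table in this message.
$counts = array(
	'sy0' => 62545, // symbols
	'br0' => 61952, // brackets
	'st0' => 39291, // strings
	'kw3' => 7746,  // keywords
	'kw1' => 3,
	'kw2' => 2,
	'co2' => 2,
	'co1' => 2,
	'nu0' => 1,
);

// Total number of highlighted spans on the page.
$total = array_sum( $counts );

// Disabling symbol highlighting drops both sy0 and br0 spans.
$withoutSymbols = $total - $counts['sy0'] - $counts['br0'];

// Additionally disabling string highlighting drops st0 spans.
$withoutStrings = $withoutSymbols - $counts['st0'];

echo "$total $withoutSymbols $withoutStrings\n"; // prints "171544 47047 7756"
```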

Because Module:languages is an extreme case, I left syntax highlighting for
strings enabled. As for symbols, I adopted the (admittedly crude) heuristic of
disabling their highlighting if the number of punctuation marks in the code is
greater than 10,000.
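
The heuristic can be sketched in isolation as follows. This is a minimal
illustration, not the code in the patch: the function name
exceedsPunctuationThreshold() and its $threshold parameter are hypothetical and
exist only for this example.

```php
<?php
/**
 * Hypothetical helper: report whether $text contains more than $threshold
 * punctuation characters, as matched by the POSIX [[:punct:]] class.
 */
function exceedsPunctuationThreshold( $text, $threshold = 10000 ) {
	// preg_match_all() returns the number of full-pattern matches,
	// or false if an error occurred.
	$count = preg_match_all( '/[[:punct:]]/', $text, $matches );
	return $count !== false && $count > $threshold;
}

// 5001 repetitions of "{}" yield 10002 punctuation characters.
var_dump( exceedsPunctuationThreshold( str_repeat( '{}', 5001 ) ) ); // bool(true)

// A short snippet with a single punctuation character ("=").
var_dump( exceedsPunctuationThreshold( 'local x = 1' ) ); // bool(false)
```

Note that the comparison is strict, so code with exactly 10,000 punctuation
characters would still get symbol highlighting.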

Change-Id: I90c645f9d03bbdc135058a3717a463dec40aa77d
---
M SyntaxHighlight_GeSHi.class.php
1 file changed, 6 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/SyntaxHighlight_GeSHi refs/changes/85/49985/1

diff --git a/SyntaxHighlight_GeSHi.class.php b/SyntaxHighlight_GeSHi.class.php
index 632c85d..4b7a711 100644
--- a/SyntaxHighlight_GeSHi.class.php
+++ b/SyntaxHighlight_GeSHi.class.php
@@ -276,6 +276,12 @@
                $geshi->enable_classes();
                $geshi->set_overall_class( "source-$lang" );
                $geshi->enable_keyword_links( false );
+
+               // Disable symbol highlighting if the code looks like it might
+               // contain a lot of symbols.
+               if ( preg_match_all( "/[[:punct:]]/", $text, $matches ) > 10000 ) {
+                       $geshi->set_symbols_highlighting( false );
+               }
                return $geshi;
        }
 

-- 
To view, visit https://gerrit.wikimedia.org/r/49985
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I90c645f9d03bbdc135058a3717a463dec40aa77d
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/extensions/SyntaxHighlight_GeSHi
Gerrit-Branch: master
Gerrit-Owner: Ori.livneh <o...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
