[Ur] Calling all Emacs wizards

Adam Chlipala Sun, 07 Aug 2011 08:18:03 -0700

The urweb-mode for Emacs has some very slow syntax highlighting, to the point of being a real hindrance to the development of non-trivial projects. I know exactly which code is to blame, but the harder question is how the same goal may be accomplished more efficiently. In the past, I've sent pleas to this list, asking for help on the issue, with no response. I'm going to try again, and this time I'm able to give more information on the problem.

The issue comes only from detecting text that is literal XML CDATA; that is, normal text that, in the case of HTML, should be passed on directly to the user. I built urweb-mode by modifying sml-mode. I presume sml-mode is doing syntax highlighting in a standard way, but, in any case, it's based on regular expressions identifying spans of text that should have particular Emacs font faces associated with them.

The crux of the problem, then, is that, in Ur/Web, being XML CDATA is a context-free property, but not a regular property (in the sense of regular languages and regular expressions). An XML sequence appears within <xml>...</xml> brackets, and within there may be "antiquoted" Ur sequences appearing within {...} brackets, within which there may be further XML, and so on, up to unbounded depth.
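To make the nesting concrete, here is a rough sketch in Python (not Elisp, and not the urweb-mode code) of a stack-based scanner that marks which characters are CDATA. The names `classify_cdata` and `cdata_text` are hypothetical, and the grammar is deliberately simplified: it ignores string literals, comments, and antiquotes inside tag attributes. The point is only that a stack, rather than a regular expression, is what tracks the alternation of <xml>...</xml> and {...}.

```python
def classify_cdata(source: str) -> list[bool]:
    """Return one flag per character: True when that character is XML CDATA.

    Simplifying assumptions (not Ur/Web's real lexical rules): no string
    literals or comments, and any tag other than <xml> ends at the next '>'.
    """
    stack: list[str] = []            # innermost context last: "xml" or "code"
    flags = [False] * len(source)
    i = 0
    while i < len(source):
        in_xml = bool(stack) and stack[-1] == "xml"
        if source.startswith("<xml>", i):
            stack.append("xml")      # entering an XML sequence
            i += len("<xml>")
        elif in_xml and source.startswith("</xml>", i):
            stack.pop()              # leaving the innermost XML sequence
            i += len("</xml>")
        elif in_xml and source[i] == "<":
            # Any other tag: skip to its closing '>' (tags are not CDATA).
            end = source.find(">", i)
            i = len(source) if end == -1 else end + 1
        elif source[i] == "{":
            stack.append("code")     # antiquote (or an ordinary Ur brace)
            i += 1
        elif source[i] == "}" and stack and stack[-1] == "code":
            stack.pop()
            i += 1
        else:
            flags[i] = in_xml        # plain text counts as CDATA only inside XML
            i += 1
    return flags


def cdata_text(source: str) -> str:
    """Concatenate the characters that classify_cdata marks as CDATA."""
    flags = classify_cdata(source)
    return "".join(ch for ch, is_cdata in zip(source, flags) if is_cdata)
```

For example, `cdata_text("val x = <xml>Hi {f <xml>in</xml>} bye</xml>")` picks out the text at both XML levels and skips the antiquoted Ur code between them.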

My current urweb-mode code uses a regular expression to identify maximal segments of text that could possibly be CDATA. Then, a custom Elisp function is called to search backward from that point, counting open and close brackets to figure out whether we are in XML. This search process may proceed arbitrarily far back in the buffer, and the process is repeated for each sequence of CDATA between tags/antiquotes. That can be a lot of different calls to this not-particularly-efficient recursive function, with no reuse of results!
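One possible way to get reuse, sketched in Python rather than Elisp and under the same simplified bracket rules (no strings, comments, or attribute handling), is to scan forward once and remember the answer for every position already visited. The class `XmlContextCache` and its `steps` counter are hypothetical names for illustration; this is not how urweb-mode currently works.

```python
class XmlContextCache:
    """Answers "is position p inside XML?" by scanning forward and caching.

    Each query extends the scan only as far as needed; positions that were
    already scanned are answered from the cache without rescanning.
    """

    def __init__(self, source: str) -> None:
        self.source = source
        self.stack: list[str] = []        # context stack at the scan frontier
        self.in_xml_at: list[bool] = []   # cached answer for each scanned position
        self.steps = 0                    # scan steps taken (for illustration only)

    def in_xml(self, pos: int) -> bool:
        if not 0 <= pos < len(self.source):
            raise IndexError(pos)
        while len(self.in_xml_at) <= pos:
            self._advance()
        return self.in_xml_at[pos]

    def _advance(self) -> None:
        """Consume one delimiter or one ordinary character at the frontier."""
        i = len(self.in_xml_at)
        src, stack = self.source, self.stack
        top = stack[-1] if stack else None
        self.steps += 1
        if src.startswith("<xml>", i):
            stack.append("xml")
            self.in_xml_at.extend([False] * len("<xml>"))
        elif top == "xml" and src.startswith("</xml>", i):
            stack.pop()
            self.in_xml_at.extend([False] * len("</xml>"))
        elif src[i] == "{":
            stack.append("code")
            self.in_xml_at.append(False)
        elif src[i] == "}" and top == "code":
            stack.pop()
            self.in_xml_at.append(False)
        else:
            # Ordinary character: inside XML only if the innermost context is XML.
            self.in_xml_at.append(top == "xml")
```

With this shape, the total scanning work is proportional to the buffer length no matter how many CDATA segments are queried. The sketch does not handle edits; after a change, cached entries from the edit position onward would have to be discarded and the stack at that point restored or recomputed.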

I've tried to bumble my way through Emacs mode authorship without sitting down to learn Elisp properly, and I'm hoping to stay on that path! Would any Emacs wizards do us the favor of reworking this part of the code to improve the efficiency? For instance, it wouldn't surprise me if there is an easy way to examine formatting already set on some text segments to speed up the decision for later segments.

All the relevant source code is in urweb/src/elisp/urweb-mode.el. Function 'urweb-in-xml' is where I hypothesize most time is spent. It's called from one of the actions in 'urweb-font-lock-keywords'.


Thanks in advance to anyone who can help fix this long-standing problem!
