Sergey Gromov wrote:
Sat, 15 Aug 2009 01:36:26 +0100, Stewart Gordon wrote:

Sergey Gromov wrote:
"foo
bar"
So there is a problem if the highlighter works by matching regexps on a line-by-line basis. But matching regexps over a whole file is no harder in principle than matching line-by-line and, when the maximal munch principle is never called to action, it can't be much less efficient. (The only bit of C or D strings that relies on maximal munch is octal escapes.)
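To make the multi-line string problem concrete, here is a minimal sketch in Python. The regexp is only an illustrative approximation of a C/D double-quoted string, not a pattern taken from TextPad or Scintilla.

```python
import re

# Illustrative pattern for a double-quoted string with backslash escapes.
# It is an assumption made for this sketch, not any editor's real rule.
STRING = re.compile(r'"(?:\\.|[^"\\])*"', re.DOTALL)

source = 'x = "foo\nbar";\n'

# Matching over the whole text finds the string that spans two lines.
whole = [m.group(0) for m in STRING.finditer(source)]

# Matching each line separately never sees both quotes together.
per_line = [m.group(0)
            for line in source.splitlines()
            for m in STRING.finditer(line)]
```

Here `whole` holds the single two-line string, while `per_line` comes out empty.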

Highlighting the whole file every time a character is typed is slow.
Scintilla doesn't do that.  It provides the lexer with a range of
changed lines.  The lexer is then free to choose a larger range if it
cannot deduce context from the initial range.  I tried to ignore this
range and re-highlight the whole file in my lexer.  The performance was
unacceptable.

Of course. I suppose now that the right strategy is line-by-line with some preservation of state between lines:

- Keep a note of the state at the beginning of each line
- When something is changed, re-highlight those lines that have changed
- Carry on re-highlighting until the state is back in sync with what was there before. If this means going way beyond the visible area of the file, record the state of the next however many lines as unknown (so that it will have another go when/if those lines are later scrolled into view).
- If a range of lines that has just come into view begins in unknown state, it's up to the particular lexer module to start from the first visible line or backtrack as far as it likes to get some context.

Is this anything like how Scintilla works?
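The strategy in the list above could be sketched roughly as follows. This is not Scintilla's API: `lex_line`, `full_states` and `relex` are hypothetical names, and the toy lexer tracks only one piece of state (inside a `/* ... */` comment or not).

```python
from typing import List, Optional


def lex_line(line: str, in_comment: bool) -> bool:
    """Toy lexer: return the state at the end of `line`, given the state at its start."""
    i = 0
    while i < len(line):
        if in_comment and line.startswith("*/", i):
            in_comment = False
            i += 2
        elif not in_comment and line.startswith("/*", i):
            in_comment = True
            i += 2
        else:
            i += 1
    return in_comment


def full_states(lines: List[str]) -> List[Optional[bool]]:
    """Initial pass: note the state at the beginning of every line."""
    states: List[Optional[bool]] = []
    state = False
    for line in lines:
        states.append(state)
        state = lex_line(line, state)
    return states


def relex(lines: List[str], start_states: List[Optional[bool]],
          first_changed: int, last_changed: int, last_visible: int) -> int:
    """Re-lex from first_changed until the state is back in sync.

    Assumes start_states[first_changed] is known. Returns the index of the
    first line that did not need to be lexed again.
    """
    state = start_states[first_changed]
    line = first_changed
    while line < len(lines):
        state = lex_line(lines[line], state)
        line += 1
        if line >= len(lines):
            break
        if line > last_changed and start_states[line] == state:
            break  # back in sync with what was recorded before the edit
        if line > last_visible:
            # Past the visible area: mark the rest as unknown (None) so it
            # is lexed later, when/if it scrolls into view. Marking all the
            # way to the end is a simplification made for this sketch.
            for k in range(line, len(lines)):
                start_states[k] = None
            break
        start_states[line] = state
    return line
```

For example, removing the `/*` that opens a three-line comment makes `relex` walk forward only until the recorded start-of-line state matches again, rather than to the end of the file.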

<snip>
It's actually trivial* to implement a lexer for Scintilla which would
work exactly as TextPad does, including use of the same configuration
files.

* That is, if you know exactly how TextPad works.

It would also be straightforward to improve TextPad's scheme to support an arbitrary number of string/comment types. How about this as an all-in-one replacement for TP's comment and string syntax directives?

[DelimitedToken1]
Start = /**
End = */
Type = DocComment
SpanLines = Yes
Nest = No

[DelimitedToken2]
Start = /*!
End = */
Type = DocComment
SpanLines = Yes
Nest = No

[DelimitedToken3]
Start = /*
End = */
Type = Comment
SpanLines = Yes
Nest = No

[DelimitedToken4]
Start = /+
End = +/
Type = Comment
SpanLines = Yes
Nest = Yes

[DelimitedToken5]
Start = //
Type = Comment
SpanLines = No
Nest = No

[DelimitedToken6]
Start = r"
End = "
Type = String
SpanLines = Yes
Nest = No

[DelimitedToken7]
Start = `
End = `
Type = String
SpanLines = Yes
Nest = No

[DelimitedToken8]
Start = "
End = "
Esc = \
Type = String
SpanLines = Yes
Nest = No

[DelimitedToken9]
Start = '
End = '
Esc = \
Type = Char
SpanLines = No
Nest = No

There, we have all of D1 covered now, and not a regexp in sight.
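As a rough illustration of how a lexer might consume definitions like these, here is a minimal Python sketch. The `Delim` class, `D1_TOKENS` table and `scan` function are hypothetical names for this example only; parsing the configuration file itself is left out, and the table simply mirrors the nine sections above in the same order, with the first matching `Start` winning.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Delim:
    start: str
    end: Optional[str]      # None: the token runs to the end of the line
    type: str
    span_lines: bool
    nest: bool
    esc: Optional[str] = None


# Order matters: "/**" and "/*!" must be tried before "/*".
D1_TOKENS = [
    Delim("/**", "*/", "DocComment", True, False),
    Delim("/*!", "*/", "DocComment", True, False),
    Delim("/*", "*/", "Comment", True, False),
    Delim("/+", "+/", "Comment", True, True),
    Delim("//", None, "Comment", False, False),
    Delim('r"', '"', "String", True, False),
    Delim("`", "`", "String", True, False),
    Delim('"', '"', "String", True, False, esc="\\"),
    Delim("'", "'", "Char", False, False, esc="\\"),
]


def _find_end(text: str, i: int, d: Delim) -> int:
    """Return the index just past the token that starts at text[i]."""
    j = i + len(d.start)
    depth = 1
    while j < len(text):
        if not d.span_lines and text[j] == "\n":
            return j                      # stop at end of line
        if d.esc and text.startswith(d.esc, j):
            j += len(d.esc) + 1           # skip the escaped character
            continue
        if d.nest and text.startswith(d.start, j):
            depth += 1
            j += len(d.start)
            continue
        if d.end is not None and text.startswith(d.end, j):
            j += len(d.end)
            depth -= 1
            if depth == 0:
                return j
            continue
        j += 1
    return min(j, len(text))              # unterminated token


def scan(text: str, tokens: List[Delim] = D1_TOKENS) -> List[Tuple[str, str]]:
    """Split text into (type, lexeme) pairs; everything else is "Plain"."""
    out: List[Tuple[str, str]] = []
    i = plain_start = 0
    while i < len(text):
        d = next((t for t in tokens if text.startswith(t.start, i)), None)
        if d is None:
            i += 1
            continue
        if plain_start < i:
            out.append(("Plain", text[plain_start:i]))
        j = _find_end(text, i, d)
        out.append((d.type, text[i:j]))
        i = plain_start = j
    if plain_start < len(text):
        out.append(("Plain", text[plain_start:]))
    return out
```

The sketch takes the definitions literally: it does no word-boundary check before `r"`, and it reads `/**/` as the start of a doc comment, so a real implementation would probably need a little more care there.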

<snip>
Basically yes, but they're going to be much more complex.  3Lu...5 is
also a range.  0x3e22.f5p6fi is a valid floating-point number.  And
still, regexps don't nest.  Don't you want to highlight DDoc sections
and macros?

That would be nice as well, as would being able to do things with Doxygen comments. But let's not try to run before we can walk.

Stewart.
