From: "Neil Hodgson" <[EMAIL PROTECTED]>
Armel Asselin:

My idea is that lexing states could be stored (approximately) every N
bytes/characters and lexing could be only pure-dynamic.

  It may be an interesting idea to explore. Cutting every N
characters leads to more states since, for example, N may split the
'/' and '*' in a C comment start. The current approach which only
lexes whole lines allows lexers to handle these short sequences easily.
The lexer could report the last simple resynchronization point before
N thus shortening the segment.

That is partly why I said _approximately_ N: the distance between
successive stored lexing states could lie anywhere in [N/2, 2N), for
example, with N large enough (say around 1000). That slack helps with
the case you describe, and also with insertion and deletion. The parsed
states _after_ the last edit point could be kept: reparse from the last
edit point and compare the result with the next previously stored
state. If the two are equal, a full reparse is avoided, which could
help with large files. The technique could be extended to resynchronize
at any later position whose stored lexing state matches.

For storage, I am thinking of either an array of
(distance-to-prev; lex-state) entries, indexed by cumulative positions
per packet of K entries, or an array of (abs-position; lex-state)
entries indexed directly on the abs-position field.
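
A minimal sketch of the (abs-position; lex-state) variant, to make the
resynchronization step concrete. Everything named here is an assumption
for illustration and not Scintilla code: the toy two-state lexer
(CODE/COMMENT for C block comments), the helper names, and the choice
to place checkpoints only at line starts so that a cut never splits
'/' and '*'.

```python
from bisect import bisect_right

CODE, COMMENT = 0, 1  # hypothetical lexer states for C block comments


def lex(text, start, end, state):
    """Toy lexer: return the state at `end`, starting at `start` in `state`.
    Both `start` and `end` are assumed to be line starts (or the text end)."""
    i = start
    while i < end:
        two = text[i:i + 2]
        if state == CODE and two == "/*":
            state, i = COMMENT, i + 2
        elif state == COMMENT and two == "*/":
            state, i = CODE, i + 2
        else:
            i += 1
    return state


def build_checkpoints(text, n):
    """Store (abs_position, state) at the first line start that is at
    least `n` characters after the previous checkpoint."""
    checkpoints = [(0, CODE)]
    pos, state = 0, CODE
    for line in text.splitlines(keepends=True):
        nxt = pos + len(line)
        state = lex(text, pos, nxt, state)
        pos = nxt
        if pos - checkpoints[-1][0] >= n and pos < len(text):
            checkpoints.append((pos, state))
    return checkpoints


def relex_after_edit(new_text, old, edit_pos, removed, inserted):
    """Update checkpoints after `removed` chars at `edit_pos` were replaced
    by `inserted` chars. Returns (checkpoints, number_of_chars_relexed)."""
    delta = inserted - removed
    # Last checkpoint at or before the edit: its state is still valid.
    k = bisect_right([p for p, _ in old], edit_pos) - 1
    kept = old[:k + 1]
    pos, state = kept[-1]
    lexed = 0
    for j in range(k + 1, len(old)):
        old_pos, old_state = old[j]
        if old_pos <= edit_pos + removed:
            continue  # inside or touching the changed region: discard
        new_pos = old_pos + delta
        state = lex(new_text, pos, new_pos, state)
        lexed += new_pos - pos
        pos = new_pos
        if state == old_state:
            # Resynchronized: the rest only needs its positions shifted.
            return kept + [(p + delta, s) for p, s in old[j:]], lexed
        kept.append((new_pos, state))
    return kept, lexed  # never resynchronized before the last checkpoint
```

With this sketch, inserting ordinary text relexes only up to the next
stored checkpoint, while inserting an unterminated "/*" keeps relexing
because no later stored state matches. The sketch does not rebalance
checkpoint spacing after an edit, so keeping distances within
[N/2, 2N) would need an extra merge/split pass.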

Armel


_______________________________________________
Scintilla-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scintilla-interest
