Hi all,
For info: this concerns SF bug #1515556, on paired delimiter
behaviour in the Bash lexer. I checked the behaviour of Bash 3.2
on Cygwin last month, and this is the write-up. I don't plan to
implement it, because it looks like a lot of complex work for a
minor win, so this is for any interested implementor to study.
Besides, Bash's exact behaviour might change in the future.
In bash, it turns out that delimiter recognition and processing
change depending on how the delimiters are nested. Take the
example from the SF bug report:
ONEPKG="^\"`echo -n "$ONELINE" | cut -f 2 -d '"'`\""
In the above, the outer "" string contains unescaped " characters.
That is legal because they sit inside a `` command string: two of
them form a nested "" string, and the third is inside a '' string.
While the problem is not insurmountable, exact lexing cannot be
accomplished by processing each delimiter pair separately. I
checked nesting behaviour for the following delimiter pairs:
'' "" $() `` ${} ()
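To make the failure concrete, here is a rough Python sketch (not Scintilla code; the helper name naive_string_end is made up for illustration) of a scanner that handles "" on its own and knows nothing about `` or ''. It assumes only that \ escapes the next character.

```python
# The line from the SF bug report, kept verbatim in a raw string.
LINE = r'''ONEPKG="^\"`echo -n "$ONELINE" | cut -f 2 -d '"'`\""'''


def naive_string_end(text, start):
    """Index of the first unescaped " after the opening quote at start, or -1."""
    i = start + 1
    while i < len(text):
        if text[i] == "\\":
            i += 2              # skip the escaped character
        elif text[i] == '"':
            return i            # first unescaped quote "closes" the string
        else:
            i += 1
    return -1


start = LINE.index('"')
end = naive_string_end(LINE, start)
print(LINE[start:end + 1])      # what the naive scanner thinks the string is
```

The naive scanner stops at the " in front of $ONELINE, well short of the real end of the string, because it cannot know that this quote belongs to the embedded `` command string.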
For '':
=======
- ended by '
- no \ escapes; \ is literal here (only the $'' form has escapes)
- no nesting
For "":
=======
- ended by "
- some \ escapes
- does not check for '' pairs
- embedded $() follows $() rules
- embedded `` follows `` rules
- embedded ${} follows ${} rules
- does not check for () pairs
For $():
========
- ended by )
- embedded '' follows '' rules
- embedded "" follows "" rules
- embedded $() follows $() rules
- embedded `` follows `` rules
- embedded ${} follows ${} rules
- checks for balanced () pairs
For ``:
=======
- ended by `
- embedded '' follows '' rules
- embedded "" follows "" rules
- embedded $() follows $() rules
- ` is never nested; an unescaped ` always ends the segment
- embedded ${} follows ${} rules
- checks for balanced () pairs
For ${}:
========
- ended by }
- other delimiter pairs can be embedded and nested much like $(),
  but once past parsing, bash flags most cases as illegal code
For ():
=======
- ended by )
- not a real delimiter pair in itself, but $() for example needs
  parentheses to be balanced
- checks for balanced () pairs
That is roughly it. The implementation problem is that lexing
behaviour changes whenever other delimiter pairs are embedded, so
some sort of stack of open delimiters is needed, I suppose. The two
command string forms, $() and ``, also work a little differently
from each other. Perhaps there is some elegant way of lexing nested
delimiters... but I can't think of anything I want to code at the
moment.
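For anyone who wants a starting point, the stack idea could look roughly like the Python sketch below. It is only an illustration, not Scintilla code: the names RULES, TOP_LEVEL and scan are invented, the table is my transcription of the lists above, and several points are assumptions (the contents allowed inside a bare () are taken to match $(), \ is taken to escape the next character everywhere except inside '', and a bare ( is not tracked at top level). Comments, here-documents and $(( )) are ignored.

```python
# Closing delimiter and the openers recognised inside each context.
EVERYTHING = ("${", "$(", "'", '"', "`", "(")
RULES = {
    "'":  ("'", ()),                            # no nesting at all
    '"':  ('"', ("${", "$(", "`")),             # ignores '' and ()
    "$(": (")", EVERYTHING),
    "`":  ("`", ("${", "$(", "'", '"', "(")),   # a ` always closes
    "${": ("}", EVERYTHING),
    "(":  (")", EVERYTHING),                    # assumption: same as $()
}
TOP_LEVEL = ("${", "$(", "'", '"', "`")         # assumption: bare ( ignored


def scan(text):
    """Return (start, end, opener) per closed segment, end exclusive."""
    stack = []          # open segments, innermost last: (opener, start)
    segments = []
    i = 0
    while i < len(text):
        context = stack[-1][0] if stack else None
        closer, openers = RULES[context] if stack else (None, TOP_LEVEL)
        # Assumption: \ escapes the next character except inside ''.
        if text[i] == "\\" and context != "'":
            i += 2
            continue
        if closer is not None and text.startswith(closer, i):
            opener, start = stack.pop()
            i += len(closer)
            segments.append((start, i, opener))
            continue
        for opener in openers:
            if text.startswith(opener, i):
                stack.append((opener, i))   # enter the nested context
                i += len(opener)
                break
        else:
            i += 1                          # ordinary character
    return segments


LINE = r'''ONEPKG="^\"`echo -n "$ONELINE" | cut -f 2 -d '"'`\""'''
for start, end, opener in scan(LINE):
    print(opener, LINE[start:end])
```

On the SF bug line this reports the inner "" and '' segments first, then the `` command string, and finally the outer "" string running to the end of the line. Segments come out innermost first because each one is recorded when its closer is seen; a real lexer would instead style characters as it goes, using the top of the stack as the current state.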
Looks like a heck of a job for a small win, so I will leave it to
someone more desperate for correctness. :-) I have not seen any
broken nested delimiter lexing in standard sh scripts; the only
sample I have seen is Enrico's SF bug report. So there.
HTH,
--
Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia
_______________________________________________
Scintilla-interest mailing list
http://mailman.lyra.org/mailman/listinfo/scintilla-interest