On Saturday, 7 July 2012 at 22:03:20 UTC, Chad J wrote:
enum : SyntaxElement
{
  AST_EXPRESSION          = 0x0001_0000_0000_0000,
    AST_UNARY_EXPR        = 0x0000_0001_0000_0000 |

This would cause wasting space (probably a lot). Definitely it would not be easy to implement in a parser generator, when various language properties are not known beforehand for fine-grained tuning.

This approach of course has shameful nesting limitations, but I feel like these determinations could be fairly well optimized even for the general case. For example: another approach that I might be more inclined to take is to give each token/symbol a low-valued index into a small inheritance table.

Depending on implementation, that might introduce the multiplier overhead of table access per each comparison (and there would be many in case of searching for nodes of specific type).

I would expect the regex engine to call the isA function as one of it's operations. Thus placing an AST_EXPRESSION into your expression would also match an AST_NEGATE_EXPR too.

But actually it is not so difficult to implement in a very similar way to what you described. I was thinking about a lookup table, but different from a traditional inheritance table. It would be indexed by AST node type (integral enum value), and store various classification information as bits. Maybe this is what you meant and I misunderstood you... Example is here: https://github.com/roman-d-boiko/dct/blob/May2012drafts/fe/core.d (sorry, it doesn't show how to do classification, and has a different context, but I hope you get the idea). The advantage over storing hierarchical information directly in each token is obviously memory usage.

Reply via email to