13-May-2013 16:01, Andrei Alexandrescu пишет:
Watch, discuss, vote up!
http://reddit.com/r/programming/comments/1e8mwq/dconf_2013_day_1_talk_3_distributed_caching/
Nice one! Reminds me the gory days of our "compiler technologies" course :)
Our professor seemed obsessed with tabulated methods as well (Dragon
book etc.) And above all - reverse polish notation as the one and holy
AST representation. Don't you try to say recursive descent parser in the
class...
Speaking of flat AST point in the talk - reverse-polish notation is
kind of neat in that it does represent flat AST and is easy to walk.
So one need not to have 2-array AST (childs + map to where are these).
Another advantage - you can slice off a sub-tree as simple as a chunk of
an array! It should be good for CPU caches (all related stuff is in one
block, no pointer chasing), but I never was close enough to a real
compiler to run any meaningful benchmarks.
Let's see the idea of AST as polish-notation array applied to the tree
in the talk. I show the forward one as it easier to read and less tricky
to walk.
Using the following notation for each node:
<arity>, array of offsets * arity, type code+contents | subtree(s)
Data is laid out byte-by-byte in raw-binary.
Offsets are in bytes starting right after type+contents.
So e.g. leaf node of int would be:
0, int
No offsets nor sub-trees, content/type tag etc. is marked as int
Then whole tree (tried to align for reading):
1, 0, S | 1, 0, DeclDefs |
3, 0, <x>, <y>, Declarator |
1, 0, BasicType | 0, int
1, 0, Identifier | 0, main
1, 0, DeclDefs | 1, 0, DeclDef | 1, 0 ReturnStatement | 0, "1"
<x> and <y> are sizes of sub-trees and can easily be calculated at
construction. Insertion is simple array insertion + fix ups of offsets
up the tree (you have to know your path in nodes anyway to back up).
Giving up on random access to child nodes would allow us to drop offset
table completely. I have limited experience on semantic analysis to tell
if it makes sense to require or drop it.
Will comment on the DFA & Unicode some time later, it's a neat topic in
its own right.
--
Dmitry Olshansky