I have succeeded in de-coupling the first part of the parser engine
Felix uses into a universal, stand-alone, dynamically loading parser.

The model is basically like this: 

Dypgen's library and Ocs Scheme are combined with a fixed
bootstrap grammar, which allows a suitable lexbuf to generate
grammar specifications from a user EBNF grammar. By preserving
the result the parser can be called again to translate a program
written in this grammar. Alternatively, the new grammar can be
enabled on the fly so that translation occurs immediately.

A special data structure called "sex" is provided as well,
with some functions to translate from Ocs Scheme s-expressions
into sex s-expressions, which are easier to pattern match.
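
To illustrate why a flattened form is easier to match, here is a minimal
sketch. Both types below are hypothetical stand-ins written for this
example: "scheme" imitates the nested pairs a Scheme interpreter uses
for lists, and "sex" is an assumed shape for the simplified form. Neither
is copied from Ocs or from the library.

```ocaml
(* Hypothetical stand-in for a Scheme value: lists are nested pairs.
   A real interpreter's value type has many more cases. *)
type scheme =
  | Snull
  | Sint of int
  | Sstring of string
  | Ssymbol of string
  | Spair of scheme * scheme

(* Hypothetical simplified s-expression type, in the spirit of "sex". *)
type sex =
  | Int of int
  | Str of string
  | Id of string
  | Lst of sex list

exception Improper_list

(* Convert nested pairs into OCaml lists so that list patterns apply. *)
let rec sex_of_scheme (v : scheme) : sex =
  match v with
  | Snull -> Lst []
  | Sint i -> Int i
  | Sstring s -> Str s
  | Ssymbol s -> Id s
  | Spair _ -> Lst (items v)

and items (v : scheme) : sex list =
  match v with
  | Snull -> []
  | Spair (hd, tl) -> sex_of_scheme hd :: items tl
  | _ -> raise Improper_list

(* One list pattern replaces three levels of pair destructuring. *)
let describe (s : sex) : string =
  match s with
  | Lst [Id "add"; Int a; Int b] -> Printf.sprintf "add %d %d" a b
  | Lst (Id head :: _) -> "form " ^ head
  | _ -> "other"

(* The Scheme list (add 1 2), spelled out as pairs. *)
let example =
  sex_of_scheme
    (Spair (Ssymbol "add", Spair (Sint 1, Spair (Sint 2, Snull))))
```

Matching the same form directly on the pair representation would need a
nested Spair pattern for every element.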

At this point, functions to translate sex format to text and to
parse it back are provided. An XML driver could be written
as well (though none is available at present).
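
A text round trip of this kind could be sketched as follows. The "sex"
type, the function names to_text and of_text, and the concrete text
syntax are all assumptions made for this example, not the library's
actual format. The sketch also assumes identifiers contain no
whitespace, parentheses or quotes, and do not look like integers.

```ocaml
(* Hypothetical simplified s-expression type. *)
type sex =
  | Int of int
  | Str of string
  | Id of string
  | Lst of sex list

exception Parse_error of string

(* Print with OCaml-style string escapes (%S adds the quotes). *)
let rec to_text (s : sex) : string =
  match s with
  | Int i -> string_of_int i
  | Str s -> Printf.sprintf "%S" s
  | Id s -> s
  | Lst xs -> "(" ^ String.concat " " (List.map to_text xs) ^ ")"

(* Recursive-descent reader for the syntax produced by to_text. *)
let of_text (text : string) : sex =
  let n = String.length text in
  let pos = ref 0 in
  let rec skip () =
    if !pos < n && (text.[!pos] = ' ' || text.[!pos] = '\n')
    then (incr pos; skip ()) in
  let rec parse () =
    skip ();
    if !pos >= n then raise (Parse_error "unexpected end of input")
    else match text.[!pos] with
      | '(' -> incr pos; Lst (parse_items [])
      | ')' -> raise (Parse_error "unexpected )")
      | '"' -> parse_string ()
      | _ -> parse_atom ()
  and parse_items acc =
    skip ();
    if !pos >= n then raise (Parse_error "missing )")
    else if text.[!pos] = ')' then (incr pos; List.rev acc)
    else let item = parse () in parse_items (item :: acc)
  and parse_string () =
    (* Find the closing quote, stepping over backslash escapes. *)
    let start = !pos + 1 in
    let rec find i =
      if i >= n then raise (Parse_error "unterminated string")
      else if text.[i] = '\\' then find (i + 2)
      else if text.[i] = '"' then i
      else find (i + 1) in
    let stop = find start in
    pos := stop + 1;
    Str (Scanf.unescaped (String.sub text start (stop - start)))
  and parse_atom () =
    let start = !pos in
    while !pos < n && not (String.contains " \n()\"" text.[!pos]) do
      incr pos
    done;
    let tok = String.sub text start (!pos - start) in
    match int_of_string_opt tok with
    | Some i -> Int i
    | None -> Id tok
  in
  let result = parse () in
  skip ();
  if !pos <> n then raise (Parse_error "trailing input");
  result
```

Under those assumptions, of_text (to_text s) gives back s.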

With this library linked into your OCaml program you do not
need to invoke the Dypgen tool. The grammar processing
is done dynamically.

The bootstrap contains facilities to load and store both the
parser automaton and the result of parsing a target file
on disk. In practice this means the automaton is only built
once, and a target file only needs to be re-parsed when it
is changed.
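
The caching idea can be sketched with the standard Marshal and Digest
modules. This is only an illustration under stated assumptions: the
names entry, save, load and cached are invented here, the on-disk layout
is not the one Felix uses, and the payload is assumed to contain no
closures (Marshal rejects those by default). Staleness is detected by
content digest, not by timestamp.

```ocaml
(* Hypothetical cache entry: the digest of the source it was built from,
   plus the payload (a parse result, or a built automaton). *)
type 'a entry = { source_digest : Digest.t; payload : 'a }

let save (cache_file : string) (e : 'a entry) : unit =
  let oc = open_out_bin cache_file in
  Fun.protect ~finally:(fun () -> close_out oc)
    (fun () -> Marshal.to_channel oc e [])

(* Marshal is not type safe: the caller must read back the same type
   that was written. A missing or truncated file counts as a miss. *)
let load (cache_file : string) : 'a entry option =
  if not (Sys.file_exists cache_file) then None
  else
    let ic = open_in_bin cache_file in
    Fun.protect ~finally:(fun () -> close_in ic)
      (fun () ->
        try Some (Marshal.from_channel ic : 'a entry)
        with End_of_file | Failure _ -> None)

(* Return the cached payload when the source is unchanged; otherwise
   call [build] on the source file and refresh the cache. *)
let cached ~(cache_file : string) ~(source : string)
    (build : string -> 'a) : 'a =
  let digest = Digest.file source in
  match load cache_file with
  | Some e when Digest.equal e.source_digest digest -> e.payload
  | _ ->
    let payload = build source in
    save cache_file { source_digest = digest; payload };
    payload
```

The same wrapper would serve for both the automaton (keyed on the
grammar file) and a parsed target file (keyed on the target).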

Facilities to automatically manage this caching and drive
the parsing library from files are available in Felix and
have not yet been de-coupled. The easiest way to get hold
of this technology is to download and build Felix,
then just copy the relevant files out of the source and build.

        http://felix-lang.org/download.html


The primary constraints on the core parsing system are:

(a) You must use the provided source reference data type.
It provides filename, start and end line and column.
Of course you can translate this as desired.

(b) You are currently stuck with the hard-coded bootstrap
language, which is an EBNF-like language with two
statements:

        syntax name { grammar here } 
        ..
        open syntax name;

At present C and C++ style comments can be used. C comments
can be nested. There is no facility for defining comment style
at the moment (I plan to fix that).

(c) Your language must support top level statements
and expressions. The statements are needed to parse
and embed syntax extensions. Any other non-terminals
can be introduced when defining new productions for
one of these.

(d) C-style #include is not supported. You must pre-process
files which include other files in a way that will change
parsing. This is a difficulty with Dypgen itself: there is no
way to maintain a stack of lexbufs, and it is impossible
in practice to recursively call a Dypgen parser from within
Dypgen itself.
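
For constraint (d), the pre-processing step could be as simple as
splicing included files into one stream of lines before the parser sees
them. The sketch below assumes a hypothetical directive written as
#include "path" on a line of its own; that syntax and the function names
are invented for the example and are not part of the library.

```ocaml
exception Include_cycle of string

(* Recognise the hypothetical directive  #include "path"  on one line. *)
let include_target (line : string) : string option =
  let line = String.trim line in
  let prefix = "#include \"" in
  let pl = String.length prefix and n = String.length line in
  if n > pl + 1 && String.sub line 0 pl = prefix && line.[n - 1] = '"'
  then Some (String.sub line pl (n - pl - 1))
  else None

let read_lines (path : string) : string list =
  let ic = open_in path in
  Fun.protect ~finally:(fun () -> close_in ic) (fun () ->
    let rec loop acc =
      match input_line ic with
      | line -> loop (line :: acc)
      | exception End_of_file -> List.rev acc
    in
    loop [])

(* Expand includes depth-first. [active] is the chain of files currently
   being expanded and is used to reject cycles (paths are compared as
   text). Relative paths are resolved against the including file's
   directory. *)
let expand (path : string) : string list =
  let rec go active path =
    if List.mem path active then raise (Include_cycle path);
    let active = path :: active in
    let dir = Filename.dirname path in
    List.concat_map
      (fun line ->
        match include_target line with
        | Some target ->
          let target =
            if Filename.is_relative target
            then Filename.concat dir target
            else target in
          go active target
        | None -> [line])
      (read_lines path)
  in
  go [] path
```

The expanded lines can then be joined and handed to the parser as a
single lexbuf, at the cost of source references pointing into the
combined text unless the origin of each line is tracked separately.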

In any *sane* language -- and this clearly excludes C and C++ --
parsing should be invariant up to inline grammar modifications
or packaged grammars.

In C and C++ maintenance of a symbol table is required to recognise
type names. This makes the parsing of a file dependent on
foreign files (#include files). [Dypgen and the Felix parser of course
allow maintenance of such symbol tables, although some care is
needed in case a proposed production that made a modification
is given up.]


--
john skaller
skal...@users.sourceforge.net
http://felix-lang.org




_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language
