> On Sun, Apr 17, 2005 at 04:33:57PM +0200, BÁRTHÁZI András wrote:
> > Just a short question I'm interested in: where will be, and how will
> > work (I just asking for a general description about it) the regular
> > expression / rules part of Parrot?
On Sun, 2005-04-17 at 09:38 -0500, Patrick R. Michaud wrote:
> The regular expression / rules part of Parrot is called "PGE",
> for "Perl/Parrot Grammar Engine", and it's currently in the compilers/pge
> directory. The intent is that rules will be another compiler within
> Parrot (i.e., it can standalone somewhat outside of Perl).

At the risk of advertising vaporware, I'd like to grab this opportunity
to provide an update on my own work.

Those of you with good memory may recall that I announced a preliminary
"Parrot Syntax Engine" on this list back in the beginning of January.
In essence, this gave the API and a basic implementation of a bottom-up
GLR parser for dynamic grammars. As such, it has some bearing on the
"rules" part of Parrot, although the interface is more generally
intended for domain-specific languages where the language syntax is
allowed to change at runtime.

To make a short story short, Leo went directly for my throat by asking
me about performance, which I had to admit was less than satisfactory.
Since then, I've spent quite some time modifying the DParser layer to
allow incremental grammar updates. The time needed to add a 340th rule
to the grammar is now about 1/1000 of the original result, or < 1 ms.

Leo also asked the following, which I never really answered:

On Fri, 2005-01-07 at 11:10, Leopold Toetsch wrote:
> - How fast is DParser compared to bison/flex?
> - What about memory usage compared to bison/flex?

I have not tested the memory consumption for a fixed grammar, but the
total memory used in building the 339-rule Python grammar was roughly
3.5 MB according to valgrind/massif. That's about 4 KB per LR state,
which should not cause a big problem in the foreseeable future. Note
that a lot of this memory can be released if the grammar is frozen!

In order to test the parsing speed, I extracted the C grammar of gcc
and constructed a yacc/lex and a PSE version of the parser.
The idea was to run both parsers on files from the Linux kernel to get
a real-world test of correctness and speed. As expected, yacc turned
out to be faster :-) Unfortunately, the difference was almost two
orders of magnitude, with DParser taking more than half a second to
parse a 2000-line file. I am not completely happy with this result, so
that is my current focus...

From time to time, life interferes with progress, and I am not at this
point ready to give an expected release date for version 0.2, mostly
because the mysterious factor 3.14 has been creeping into all estimates
I've attempted so far.

I just wanted to give the information I have, in case someone wondered
what happened to this little project :-|

/Henrik