hi,
I am a grad student studying at the University of Illinois at Chicago.
I wanted to contribute to Bison through the Google Summer of Code
program and I prepared this proposal.
But somehow in the last minute my university tells me I am not
eligible to participate in that program (and get paid) because I do
not have a US work authorization yet.
I decided to go ahead and contribute independently. That way, even
though I don't get paid, I will (hopefully) learn something. So if
there isn't anyone contributing to the same project through google
summer of code, please consider mentoring me. My aim is to gain a
thorough knowledge of bison internals and practical issues in
implementation of parsing algorithms.
Project title: Implementation of Burke-Fisher Error correction
implementation for bison C output.
pls see attached proposal for details.
Thanks!
Satya.
http://satyakiran.googlepages.com/google_soc.txt
Google Summer of Code Application
Mentor - GNU
CONTENTS
1. NAME / CONTACT INFORMATION
2. PROJECT TITILE
3. PROBLEM DESCRIPTION / SUGGESTED SOLUTION / DELIVERABLES
4. PLAN FOR THE PROJECT
5. QUALIFICATION
5. RESUME
6. REFERENCES
*********************** NAME / CONTACT INFORMATION
************************************************************
Satya Kiran Popuri
1062 W Taylor st Apt 2
Chicago IL 60607 USA.
Phone:
773-780-1101 ( Cell )
*********************** PRJECT TITLE
**************************************************************************
GNU Bison: Burke-Fisher Error Correction
*********************** PROBLEM DESCRIPTION /SUGGESTED SOLUTION
***********************************************
To implement Burke-Fisher Error Correction into Bison C output.
Summary:
YACC traditionally implements error correction with error productions and the
inbuilt 'error' token. This requires the grammar writer to clutter her grammar
with possible error productions. The burden of strategic error recovery is put
on the grammar writer.
A global error recovery algorithm like Burke-Fischer error repair can handle
errors without any explicit error productions in the grammar. The algorithm
looks for a single token insertion/deletion/replacement to the left of (and
including) the erroneous token and tries to parse through the error token
successfully. This strategy is based on the fact that an incorrect token can
occur in the input string much before the token where the error is actually
discovered. The algorithm is described in the paper titled "A Practical Method
for LR and LL Syntactic Error Diagnosis and Recovery" by Michael G. Burke and
Gerald A. Fisher. The full text of this paper is available from:
http://www.cs.berkeley.edu/~jcondit/pl-prelim/burke87practical.pdf
Solution:
An implementation of the Burke-Fisher Error Correction algorithm requires two
simultaneous parsers to be generated in the bison output. One parser performs
shift/reduce actions on the "current" stack and a second "old" parser follows
the same actions k tokens behind the current parser. When an error is
encountered, the current parser can call the old parser to try all possible
single token replacements starting k positions to the left of the current
token. These k tokens are stored in a steady queue. When the old parser is able
to parse R=4 tokens ahead, with one of the single token
insertions/deletions/replacements, we can consider the error to be recovered
successfully.
ML-YACC, a parser generator written in ML programming language already offers
this facility. An ML implementation of the above algorithm is available online
at: http://www.cs.cmu.edu/afs/cs/user/fp/courses/95-lp/code/elpsml/base.sml
Semantic Actions:
Since bison is often used to generate parsers for many different languages, we
cannot guarantee side-affect free semantic actions. Hence the semantic actions
of the grammar productions reduced by the "current" parser must be deferred
until the "old" parser has parsed successfully. All semantic actions are
carried out on the "old stack". The role of the "current" parser is just to
check syntax.
Additional Bison declarations for error recovery:
When token insertions are tried, we need some semantic value for the inserted
token. These values can be supplied by adding a new %value directive to bison.
Also, the grammar writer will be able to suggest some error corrections for
token replacement by using the %change declaration as implemented in ML-YACC.
Deliverables:
1. The changed bison code. Particularly the parser generator part of the
code will change a lot. Also minor changes to the front end to check for a new
command line option for Burke-Fisher error recovery mode.
2. Documentation of all changes made.
*********************** PROJECT SCHEDULE
***********************************************
1. May 23 - June 16 Understand Bison code base and do a low level
design document of the code changes to be made.
2. June 19- July 21 Make required code changes in bison and do unit
tests. Also prepare regression and system test cases.
3. Jul 24 - Aug 13 System testing and regression testing with
existing bison error mechanism and also the new mechanism implemented.
4. Aug 14 - Aug 21 Documentation - Documentation will be carried
out with the development phase. This final stage is only to refine the
documentation.
*********************** QUALIFICATION
***********************************************
I studied compiler design during spring 2006 at UIC. LR parsing theory was a
large part of the curriculum. I used Bison to generate a parser and abstract
syntax tree for a small subset of C as part of the course (see
http://satyakiran.googlepages.com/cminus for details). But this was only a toy
compiler. Now I want to explore real world problems. I think a compiler
compiler will enable me to understand things more deeply than just a compiler.
Another important reason for taking up this project is that I always wanted to
contribute to free software, but never got a chance. I thought now is a good
time and opportunity to dive into the world of free software.
*********************** RESUME ***********************************************
Please visit http://satyakiran.googlepages.com/hresume.html to view my resume.
Thanks.
*********************** REFERENCES
***********************************************
1. Prof V.N. VenkataKrishnan, Dept. of Computer Science, University of Illinois
a Chicago, Chicago, IL. Ph: [EMAIL PROTECTED]
_______________________________________________
help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison