Apparently, it is a known problem and the solution is to write the CFG differently... http://www.google.fr/url?sa=t&source=web&cd=7&ved=0CEsQFjAG&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.3.9583%26rep%3Drep1%26type%3Dpdf&ei=uLmUTqzQMcfrsgaZv6C-BQ&usg=AFQjCNHI7EfNrHlqmZYMIXEDhxibfnrZcg&sig2=Zv7847-YG0CxbfnMbEc4Lw
On Tue, Oct 11, 2011 at 8:37 PM, Xavier MONTILLET <xavierm02....@gmail.com> wrote: > I guess I could try to add a loop that increments a variable > containing the number of time it should read the same rule while it > doesn't match the whole string and the number of characters matched > increases with the variable... > But it looks like going around the problem without really fixing it... > > On Tue, Oct 11, 2011 at 7:57 PM, Xavier MONTILLET > <xavierm02....@gmail.com> wrote: >> I just reformated the whole thing and it kind of works. >> http://jsfiddle.net/xavierm02/ndDVa/1/ >> I get a CST and I think I can manage to turn it into an AST. >> But I still have a problem with the CFG... As shown, it works >> >> But if I change the order of the things to match it fails. At last on >> recursive rules. >> Here is an example : >> >> Text >> Text Letter >> >> The code about won't work with my "parser" but if i rewrite it to >> avoid having the recursive rule first, it works: >> >> Text >> Letter Text >> >> The above example works... >> >> But I'm quite sure I won't always be able to avoid having a recursive >> rule as first rule so... Do you guys see anything I could do to avoid >> it? >> >> I tried allowing a given number of recursion before simply returning >> false when seeing a recursive rule but if you have more than that >> number, it fails :/ >> And it I don't do it at all, I get an infinite loop. >> >> Here's the exact same code except I changed the order in the >> declaration of MultiplicativeExpression or AdditiveExpression. >> http://jsfiddle.net/xavierm02/ndDVa/2/ >> >> On Sun, Oct 9, 2011 at 8:16 PM, Xavier MONTILLET >> <xavierm02....@gmail.com> wrote: >>> I figured out how to get something looking like an AST. >>> http://jsfiddle.net/xavierm02/HG6Bx/1/ >>> But comments and ideas on how to throw errors are still welcome. >>> >>> On Sun, Oct 9, 2011 at 6:10 PM, Xavier MONTILLET >>> <xavierm02....@gmail.com> wrote: >>>> And how do you throw errors with a CFG ? >>>> I mean JSLint does throw errors but the Lexer doesn't use a CFG... >>>> If you use a CFG, how do you throw errors ? You take the longest / >>>> deepest "possibility" that had a meaning ? >>>> >>>> On Sun, Oct 9, 2011 at 4:17 PM, Xavier MONTILLET >>>> <xavierm02....@gmail.com> wrote: >>>>> Btw, I have no knowledge in Computer Science at all. So if you know >>>>> tutorials / books explaining how to build a parser without needed too >>>>> much knowledge, please tell me. Because every single tutorial I tried >>>>> to read started with strange formulas I couldn't understand and, >>>>> instead of providing pseudo-code, just described what the code should >>>>> do, gave mathematical methods which I couldn't understand to check >>>>> whether the result was going to be what we wanted or not and so on... >>>>> >>>>> (I know JavaScript quite well because I started doing websites in >>>>> high-school but since I left High-school (last year), I've only been >>>>> doing Maths and Physics in "classes preparatoire" (a France-specific >>>>> system where you do a lot of Maths / Physics for two years in order to >>>>> prepare "contests" which, depending on what your ranking is, will >>>>> allow you to choose a given school or not. And I won't be doing any >>>>> computer science until then...) >>>>> >>>>> On Sun, Oct 9, 2011 at 4:05 PM, Xavier MONTILLET >>>>> <xavierm02....@gmail.com> wrote: >>>>>> I've started writing the Lexer and for it I used some kind of CFG. >>>>>> But I'm not sure whether I understood the aim of those things >>>>>> properly. I mean is the CFG supposed to let you build a syntax tree or >>>>>> just a token list ? Because I can't figure out how to build the syntax >>>>>> tree from it... >>>>>> Anyway, I wrote a CFG for basic Maths : >>>>>> >>>>>> Digit >>>>>> '0' >>>>>> '1' >>>>>> '2' >>>>>> '3' >>>>>> '4' >>>>>> '5' >>>>>> '6' >>>>>> '7' >>>>>> '8' >>>>>> '9' >>>>>> >>>>>> Operator >>>>>> '+' >>>>>> '-' >>>>>> '*' >>>>>> '/' >>>>>> >>>>>> Expression >>>>>> '(' Expression ')' >>>>>> Expression Operator Expression >>>>>> Number >>>>>> >>>>>> And as I didn't want to bother writing a parser to parse the CFG for >>>>>> my parser, I transformed it in JS : >>>>>> >>>>>> var rules = ( function ( ) { >>>>>> var Digit = [ ]; >>>>>> var Expression = [ ]; >>>>>> var Operator = [ ]; >>>>>> Digit.name = 'Digit'; >>>>>> Expression.name = 'Expression'; >>>>>> Operator.name = 'Operator'; >>>>>> Digit.push( >>>>>> [ '0' ], >>>>>> [ '1' ], >>>>>> [ '2' ], >>>>>> [ '3' ], >>>>>> [ '4' ], >>>>>> [ '5' ], >>>>>> [ '6' ], >>>>>> [ '7' ], >>>>>> [ '8' ], >>>>>> [ '9' ] >>>>>> ); >>>>>> Expression.push( >>>>>> [ '(', Expression, ')' ], >>>>>> [ Expression, Operator, Expression ], >>>>>> [ Digit ] >>>>>> ); >>>>>> Operator.push( >>>>>> [ '+' ], >>>>>> [ '-' ], >>>>>> [ '*' ], >>>>>> [ '/' ] >>>>>> ); >>>>>> return { >>>>>> Digit: Digit, >>>>>> Expression: Expression, >>>>>> Operator: Operator >>>>>> }; >>>>>> } )( ); >>>>>> >>>>>> The name properties are here because I had some infinite loops so it >>>>>> was easier to debug with it... >>>>>> And I define the arrays before so that I can reference them in others. >>>>>> >>>>>> Then, with this, I built a lexer: >>>>>> >>>>>> http://jsfiddle.net/xavierm02/tUCXY/ >>>>>> >>>>>> Is it good or not ? >>>>>> Any comment would be welcome. The only thing I'm planning on changing >>>>>> is the j += newTokens.join( '' ).length; I probably need it because I >>>>>> didn't store indexes properly... Or maybe because I use readTokens >>>>>> that calls itself instead of a loop. >>>>>> >>>>>> Thanks in advance for your comments. >>>>>> >>>>>> On Mon, Oct 3, 2011 at 9:11 PM, Xavier MONTILLET >>>>>> <xavierm02....@gmail.com> wrote: >>>>>>> Ok thank you :) >>>>>>> >>>>>>> On Sat, Oct 1, 2011 at 4:31 PM, Lasse Reichstein >>>>>>> <reichsteinatw...@gmail.com> wrote: >>>>>>>> There is nothing in the complexity of '+' that mandates your Abstract >>>>>>>> Syntax >>>>>>>> Tree layout. It's still pure syntax at this point. >>>>>>>> However, any use of the parsed syntax must behave as if the (+ a b c) >>>>>>>> syntax >>>>>>>> tree is equivalent to (+ (+ a b) c), so it won't buy you much, and >>>>>>>> probably >>>>>>>> just makes using the parsed syntax more complex. >>>>>>>> If repeated use of the same operator happens a lot, and you can easily >>>>>>>> handle the pairing later, by all means do make a single vairable-sized >>>>>>>> node. >>>>>>>> E.g., in the RegExp grammar, adjacent literal characters are really a >>>>>>>> sequence of Alternatives of Terms (of Atoms), but it happens so often >>>>>>>> that >>>>>>>> you probably want to parse /abel.*/ as containing a single text-node >>>>>>>> with >>>>>>>> the text "abel". >>>>>>>> Just don't optimize the abstract syntax without considering how it's >>>>>>>> going >>>>>>>> to be used. >>>>>>>> /L >>>>>>>> >>>>>>>> On Sat, Oct 1, 2011 at 12:28 PM, Xavier MONTILLET >>>>>>>> <xavierm02....@gmail.com> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Right. I totally forgot that... >>>>>>>>> So I have to keep only two operands per operator. >>>>>>>>> Thank you for answering :) >>>>>>>>> >>>>>>>>> On Sat, Oct 1, 2011 at 12:48 AM, Poetro <poe...@gmail.com> wrote: >>>>>>>>> > The + operator is not just for numbers and can have sideeffects, if >>>>>>>>> > the operands have different types or the operands are not numbers or >>>>>>>>> > strings (and even in that case). This makes the + operator tricky. >>>>>>>>> > >>>>>>>>> >>>> "boo" + 1 + 2 >>>>>>>>> > "boo12" >>>>>>>>> >>>> 1 + 2 + "boo" >>>>>>>>> > "3boo" >>>>>>>>> > >>>>>>>>> >>>> var a = {toString: function () { return 1; }, valueOf: function >>>>>>>>> >>>> () { >>>>>>>>> >>>> return 2; }}, b = 0; a + b >>>>>>>>> > 2 >>>>>>>>> > >>>>>>>>> > -- >>>>>>>>> > Poetro >>>>>>>>> > >>>>>>>>> > -- >>>>>>>>> > To view archived discussions from the original JSMentors Mailman >>>>>>>>> > list: >>>>>>>>> > http://www.mail-archive.com/jsmentors@jsmentors.com/ >>>>>>>>> > >>>>>>>>> > To search via a non-Google archive, visit here: >>>>>>>>> > http://www.mail-archive.com/jsmentors@googlegroups.com/ >>>>>>>>> > >>>>>>>>> > To unsubscribe from this group, send email to >>>>>>>>> > jsmentors+unsubscr...@googlegroups.com >>>>>>>>> > >>>>>>>>> >>>>>>>>> -- >>>>>>>>> To view archived discussions from the original JSMentors Mailman list: >>>>>>>>> http://www.mail-archive.com/jsmentors@jsmentors.com/ >>>>>>>>> >>>>>>>>> To search via a non-Google archive, visit here: >>>>>>>>> http://www.mail-archive.com/jsmentors@googlegroups.com/ >>>>>>>>> >>>>>>>>> To unsubscribe from this group, send email to >>>>>>>>> jsmentors+unsubscr...@googlegroups.com >>>>>>>> >>>>>>>> -- >>>>>>>> To view archived discussions from the original JSMentors Mailman list: >>>>>>>> http://www.mail-archive.com/jsmentors@jsmentors.com/ >>>>>>>> >>>>>>>> To search via a non-Google archive, visit here: >>>>>>>> http://www.mail-archive.com/jsmentors@googlegroups.com/ >>>>>>>> >>>>>>>> To unsubscribe from this group, send email to >>>>>>>> jsmentors+unsubscr...@googlegroups.com >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -- To view archived discussions from the original JSMentors Mailman list: http://www.mail-archive.com/jsmentors@jsmentors.com/ To search via a non-Google archive, visit here: http://www.mail-archive.com/jsmentors@googlegroups.com/ To unsubscribe from this group, send email to jsmentors+unsubscr...@googlegroups.com