I just reformated the whole thing and it kind of works. http://jsfiddle.net/xavierm02/ndDVa/1/ I get a CST and I think I can manage to turn it into an AST. But I still have a problem with the CFG... As shown, it works
But if I change the order of the things to match it fails. At last on recursive rules. Here is an example : Text Text Letter The code about won't work with my "parser" but if i rewrite it to avoid having the recursive rule first, it works: Text Letter Text The above example works... But I'm quite sure I won't always be able to avoid having a recursive rule as first rule so... Do you guys see anything I could do to avoid it? I tried allowing a given number of recursion before simply returning false when seeing a recursive rule but if you have more than that number, it fails :/ And it I don't do it at all, I get an infinite loop. Here's the exact same code except I changed the order in the declaration of MultiplicativeExpression or AdditiveExpression. http://jsfiddle.net/xavierm02/ndDVa/2/ On Sun, Oct 9, 2011 at 8:16 PM, Xavier MONTILLET <xavierm02....@gmail.com> wrote: > I figured out how to get something looking like an AST. > http://jsfiddle.net/xavierm02/HG6Bx/1/ > But comments and ideas on how to throw errors are still welcome. > > On Sun, Oct 9, 2011 at 6:10 PM, Xavier MONTILLET > <xavierm02....@gmail.com> wrote: >> And how do you throw errors with a CFG ? >> I mean JSLint does throw errors but the Lexer doesn't use a CFG... >> If you use a CFG, how do you throw errors ? You take the longest / >> deepest "possibility" that had a meaning ? >> >> On Sun, Oct 9, 2011 at 4:17 PM, Xavier MONTILLET >> <xavierm02....@gmail.com> wrote: >>> Btw, I have no knowledge in Computer Science at all. So if you know >>> tutorials / books explaining how to build a parser without needed too >>> much knowledge, please tell me. Because every single tutorial I tried >>> to read started with strange formulas I couldn't understand and, >>> instead of providing pseudo-code, just described what the code should >>> do, gave mathematical methods which I couldn't understand to check >>> whether the result was going to be what we wanted or not and so on... >>> >>> (I know JavaScript quite well because I started doing websites in >>> high-school but since I left High-school (last year), I've only been >>> doing Maths and Physics in "classes preparatoire" (a France-specific >>> system where you do a lot of Maths / Physics for two years in order to >>> prepare "contests" which, depending on what your ranking is, will >>> allow you to choose a given school or not. And I won't be doing any >>> computer science until then...) >>> >>> On Sun, Oct 9, 2011 at 4:05 PM, Xavier MONTILLET >>> <xavierm02....@gmail.com> wrote: >>>> I've started writing the Lexer and for it I used some kind of CFG. >>>> But I'm not sure whether I understood the aim of those things >>>> properly. I mean is the CFG supposed to let you build a syntax tree or >>>> just a token list ? Because I can't figure out how to build the syntax >>>> tree from it... >>>> Anyway, I wrote a CFG for basic Maths : >>>> >>>> Digit >>>> '0' >>>> '1' >>>> '2' >>>> '3' >>>> '4' >>>> '5' >>>> '6' >>>> '7' >>>> '8' >>>> '9' >>>> >>>> Operator >>>> '+' >>>> '-' >>>> '*' >>>> '/' >>>> >>>> Expression >>>> '(' Expression ')' >>>> Expression Operator Expression >>>> Number >>>> >>>> And as I didn't want to bother writing a parser to parse the CFG for >>>> my parser, I transformed it in JS : >>>> >>>> var rules = ( function ( ) { >>>> var Digit = [ ]; >>>> var Expression = [ ]; >>>> var Operator = [ ]; >>>> Digit.name = 'Digit'; >>>> Expression.name = 'Expression'; >>>> Operator.name = 'Operator'; >>>> Digit.push( >>>> [ '0' ], >>>> [ '1' ], >>>> [ '2' ], >>>> [ '3' ], >>>> [ '4' ], >>>> [ '5' ], >>>> [ '6' ], >>>> [ '7' ], >>>> [ '8' ], >>>> [ '9' ] >>>> ); >>>> Expression.push( >>>> [ '(', Expression, ')' ], >>>> [ Expression, Operator, Expression ], >>>> [ Digit ] >>>> ); >>>> Operator.push( >>>> [ '+' ], >>>> [ '-' ], >>>> [ '*' ], >>>> [ '/' ] >>>> ); >>>> return { >>>> Digit: Digit, >>>> Expression: Expression, >>>> Operator: Operator >>>> }; >>>> } )( ); >>>> >>>> The name properties are here because I had some infinite loops so it >>>> was easier to debug with it... >>>> And I define the arrays before so that I can reference them in others. >>>> >>>> Then, with this, I built a lexer: >>>> >>>> http://jsfiddle.net/xavierm02/tUCXY/ >>>> >>>> Is it good or not ? >>>> Any comment would be welcome. The only thing I'm planning on changing >>>> is the j += newTokens.join( '' ).length; I probably need it because I >>>> didn't store indexes properly... Or maybe because I use readTokens >>>> that calls itself instead of a loop. >>>> >>>> Thanks in advance for your comments. >>>> >>>> On Mon, Oct 3, 2011 at 9:11 PM, Xavier MONTILLET >>>> <xavierm02....@gmail.com> wrote: >>>>> Ok thank you :) >>>>> >>>>> On Sat, Oct 1, 2011 at 4:31 PM, Lasse Reichstein >>>>> <reichsteinatw...@gmail.com> wrote: >>>>>> There is nothing in the complexity of '+' that mandates your Abstract >>>>>> Syntax >>>>>> Tree layout. It's still pure syntax at this point. >>>>>> However, any use of the parsed syntax must behave as if the (+ a b c) >>>>>> syntax >>>>>> tree is equivalent to (+ (+ a b) c), so it won't buy you much, and >>>>>> probably >>>>>> just makes using the parsed syntax more complex. >>>>>> If repeated use of the same operator happens a lot, and you can easily >>>>>> handle the pairing later, by all means do make a single vairable-sized >>>>>> node. >>>>>> E.g., in the RegExp grammar, adjacent literal characters are really a >>>>>> sequence of Alternatives of Terms (of Atoms), but it happens so often >>>>>> that >>>>>> you probably want to parse /abel.*/ as containing a single text-node with >>>>>> the text "abel". >>>>>> Just don't optimize the abstract syntax without considering how it's >>>>>> going >>>>>> to be used. >>>>>> /L >>>>>> >>>>>> On Sat, Oct 1, 2011 at 12:28 PM, Xavier MONTILLET >>>>>> <xavierm02....@gmail.com> >>>>>> wrote: >>>>>>> >>>>>>> Right. I totally forgot that... >>>>>>> So I have to keep only two operands per operator. >>>>>>> Thank you for answering :) >>>>>>> >>>>>>> On Sat, Oct 1, 2011 at 12:48 AM, Poetro <poe...@gmail.com> wrote: >>>>>>> > The + operator is not just for numbers and can have sideeffects, if >>>>>>> > the operands have different types or the operands are not numbers or >>>>>>> > strings (and even in that case). This makes the + operator tricky. >>>>>>> > >>>>>>> >>>> "boo" + 1 + 2 >>>>>>> > "boo12" >>>>>>> >>>> 1 + 2 + "boo" >>>>>>> > "3boo" >>>>>>> > >>>>>>> >>>> var a = {toString: function () { return 1; }, valueOf: function () >>>>>>> >>>> { >>>>>>> >>>> return 2; }}, b = 0; a + b >>>>>>> > 2 >>>>>>> > >>>>>>> > -- >>>>>>> > Poetro >>>>>>> > >>>>>>> > -- >>>>>>> > To view archived discussions from the original JSMentors Mailman list: >>>>>>> > http://www.mail-archive.com/jsmentors@jsmentors.com/ >>>>>>> > >>>>>>> > To search via a non-Google archive, visit here: >>>>>>> > http://www.mail-archive.com/jsmentors@googlegroups.com/ >>>>>>> > >>>>>>> > To unsubscribe from this group, send email to >>>>>>> > jsmentors+unsubscr...@googlegroups.com >>>>>>> > >>>>>>> >>>>>>> -- >>>>>>> To view archived discussions from the original JSMentors Mailman list: >>>>>>> http://www.mail-archive.com/jsmentors@jsmentors.com/ >>>>>>> >>>>>>> To search via a non-Google archive, visit here: >>>>>>> http://www.mail-archive.com/jsmentors@googlegroups.com/ >>>>>>> >>>>>>> To unsubscribe from this group, send email to >>>>>>> jsmentors+unsubscr...@googlegroups.com >>>>>> >>>>>> -- >>>>>> To view archived discussions from the original JSMentors Mailman list: >>>>>> http://www.mail-archive.com/jsmentors@jsmentors.com/ >>>>>> >>>>>> To search via a non-Google archive, visit here: >>>>>> http://www.mail-archive.com/jsmentors@googlegroups.com/ >>>>>> >>>>>> To unsubscribe from this group, send email to >>>>>> jsmentors+unsubscr...@googlegroups.com >>>>>> >>>>> >>>> >>> >> > -- To view archived discussions from the original JSMentors Mailman list: http://www.mail-archive.com/jsmentors@jsmentors.com/ To search via a non-Google archive, visit here: http://www.mail-archive.com/jsmentors@googlegroups.com/ To unsubscribe from this group, send email to jsmentors+unsubscr...@googlegroups.com