Apparently, it is a known problem and the solution is to write the CFG
differently...
http://www.google.fr/url?sa=t&source=web&cd=7&ved=0CEsQFjAG&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.3.9583%26rep%3Drep1%26type%3Dpdf&ei=uLmUTqzQMcfrsgaZv6C-BQ&usg=AFQjCNHI7EfNrHlqmZYMIXEDhxibfnrZcg&sig2=Zv7847-YG0CxbfnMbEc4Lw

On Tue, Oct 11, 2011 at 8:37 PM, Xavier MONTILLET
<xavierm02....@gmail.com> wrote:
> I guess I could try to add a loop that increments a variable
> containing the number of time it should read the same rule while it
> doesn't match the whole string and the number of characters matched
> increases with the variable...
> But it looks like going around the problem without really fixing it...
>
> On Tue, Oct 11, 2011 at 7:57 PM, Xavier MONTILLET
> <xavierm02....@gmail.com> wrote:
>> I just reformated the whole thing and it kind of works.
>> http://jsfiddle.net/xavierm02/ndDVa/1/
>> I get a CST and I think I can manage to turn it into an AST.
>> But I still have a problem with the CFG... As shown, it works
>>
>> But if I change the order of the things to match it fails. At last on
>> recursive rules.
>> Here is an example :
>>
>> Text
>>    Text Letter
>>
>> The code about won't work with my "parser" but if i rewrite it to
>> avoid having the recursive rule first, it works:
>>
>> Text
>>    Letter Text
>>
>> The above example works...
>>
>> But I'm quite sure I won't always be able to avoid having a recursive
>> rule as first rule so... Do you guys see anything I could do to avoid
>> it?
>>
>> I tried allowing a given number of recursion before simply returning
>> false when seeing a recursive rule but if you have more than that
>> number, it fails :/
>> And it I don't do it at all, I get an infinite loop.
>>
>> Here's the exact same code except I changed the order in the
>> declaration of MultiplicativeExpression or AdditiveExpression.
>> http://jsfiddle.net/xavierm02/ndDVa/2/
>>
>> On Sun, Oct 9, 2011 at 8:16 PM, Xavier MONTILLET
>> <xavierm02....@gmail.com> wrote:
>>> I figured out how to get something looking like an AST.
>>> http://jsfiddle.net/xavierm02/HG6Bx/1/
>>> But comments and ideas on how to throw errors are still welcome.
>>>
>>> On Sun, Oct 9, 2011 at 6:10 PM, Xavier MONTILLET
>>> <xavierm02....@gmail.com> wrote:
>>>> And how do you throw errors with a CFG ?
>>>> I mean JSLint does throw errors but the Lexer doesn't use a CFG...
>>>> If you use a CFG, how do you throw errors ? You take the longest /
>>>> deepest "possibility" that had a meaning ?
>>>>
>>>> On Sun, Oct 9, 2011 at 4:17 PM, Xavier MONTILLET
>>>> <xavierm02....@gmail.com> wrote:
>>>>> Btw, I have no knowledge in Computer Science at all. So if you know
>>>>> tutorials / books explaining how to build a parser without needed too
>>>>> much knowledge, please tell me. Because every single tutorial I tried
>>>>> to read started with strange formulas I couldn't understand and,
>>>>> instead of providing pseudo-code, just described what the code should
>>>>> do, gave mathematical methods which I couldn't understand to check
>>>>> whether the result was going to be what we wanted or not and so on...
>>>>>
>>>>> (I know JavaScript quite well because I started doing websites in
>>>>> high-school but since I left High-school (last year), I've only been
>>>>> doing Maths and Physics in "classes preparatoire" (a France-specific
>>>>> system where you do a lot of Maths / Physics for two years in order to
>>>>> prepare "contests" which, depending on what your ranking is, will
>>>>> allow you to choose a given school or not. And I won't be doing any
>>>>> computer science until then...)
>>>>>
>>>>> On Sun, Oct 9, 2011 at 4:05 PM, Xavier MONTILLET
>>>>> <xavierm02....@gmail.com> wrote:
>>>>>> I've started writing the Lexer and for it I used some kind of CFG.
>>>>>> But I'm not sure whether I understood the aim of those things
>>>>>> properly. I mean is the CFG supposed to let you build a syntax tree or
>>>>>> just a token list ? Because I can't figure out how to build the syntax
>>>>>> tree from it...
>>>>>> Anyway, I wrote a CFG for basic Maths :
>>>>>>
>>>>>> Digit
>>>>>>    '0'
>>>>>>    '1'
>>>>>>    '2'
>>>>>>    '3'
>>>>>>    '4'
>>>>>>    '5'
>>>>>>    '6'
>>>>>>    '7'
>>>>>>    '8'
>>>>>>    '9'
>>>>>>
>>>>>> Operator
>>>>>>    '+'
>>>>>>    '-'
>>>>>>    '*'
>>>>>>    '/'
>>>>>>
>>>>>> Expression
>>>>>>    '(' Expression ')'
>>>>>>    Expression Operator Expression
>>>>>>    Number
>>>>>>
>>>>>> And as I didn't want to bother writing a parser to parse the CFG for
>>>>>> my parser, I transformed it in JS :
>>>>>>
>>>>>> var rules = ( function ( ) {
>>>>>>    var Digit = [ ];
>>>>>>    var Expression = [ ];
>>>>>>    var Operator = [ ];
>>>>>>    Digit.name = 'Digit';
>>>>>>    Expression.name = 'Expression';
>>>>>>    Operator.name = 'Operator';
>>>>>>    Digit.push(
>>>>>>        [ '0' ],
>>>>>>        [ '1' ],
>>>>>>        [ '2' ],
>>>>>>        [ '3' ],
>>>>>>        [ '4' ],
>>>>>>        [ '5' ],
>>>>>>        [ '6' ],
>>>>>>        [ '7' ],
>>>>>>        [ '8' ],
>>>>>>        [ '9' ]
>>>>>>    );
>>>>>>    Expression.push(
>>>>>>        [ '(', Expression, ')' ],
>>>>>>        [ Expression, Operator, Expression ],
>>>>>>        [ Digit ]
>>>>>>    );
>>>>>>    Operator.push(
>>>>>>        [ '+' ],
>>>>>>        [ '-' ],
>>>>>>        [ '*' ],
>>>>>>        [ '/' ]
>>>>>>    );
>>>>>>    return {
>>>>>>        Digit: Digit,
>>>>>>        Expression: Expression,
>>>>>>        Operator: Operator
>>>>>>    };
>>>>>> } )( );
>>>>>>
>>>>>> The name properties are here because I had some infinite loops so it
>>>>>> was easier to debug with it...
>>>>>> And I define the arrays before so that I can reference them in others.
>>>>>>
>>>>>> Then, with this, I built a lexer:
>>>>>>
>>>>>> http://jsfiddle.net/xavierm02/tUCXY/
>>>>>>
>>>>>> Is it good or not ?
>>>>>> Any comment would be welcome. The only thing I'm planning on changing
>>>>>> is the j += newTokens.join( '' ).length; I probably need it because I
>>>>>> didn't store indexes properly... Or maybe because I use readTokens
>>>>>> that calls itself instead of a loop.
>>>>>>
>>>>>> Thanks in advance for your comments.
>>>>>>
>>>>>> On Mon, Oct 3, 2011 at 9:11 PM, Xavier MONTILLET
>>>>>> <xavierm02....@gmail.com> wrote:
>>>>>>> Ok thank you :)
>>>>>>>
>>>>>>> On Sat, Oct 1, 2011 at 4:31 PM, Lasse Reichstein
>>>>>>> <reichsteinatw...@gmail.com> wrote:
>>>>>>>> There is nothing in the complexity of '+' that mandates your Abstract 
>>>>>>>> Syntax
>>>>>>>> Tree layout. It's still pure syntax at this point.
>>>>>>>> However, any use of the parsed syntax must behave as if the (+ a b c) 
>>>>>>>> syntax
>>>>>>>> tree is equivalent to (+ (+ a b) c), so it won't buy you much, and 
>>>>>>>> probably
>>>>>>>> just makes using the parsed syntax more complex.
>>>>>>>> If repeated use of the same operator happens a lot, and you can easily
>>>>>>>> handle the pairing later, by all means do make a single vairable-sized 
>>>>>>>> node.
>>>>>>>> E.g., in the RegExp grammar, adjacent literal characters are really a
>>>>>>>> sequence of Alternatives of Terms (of Atoms), but it happens so often 
>>>>>>>> that
>>>>>>>> you probably want to parse /abel.*/ as containing a single text-node 
>>>>>>>> with
>>>>>>>> the text "abel".
>>>>>>>> Just don't optimize the abstract syntax without considering how it's 
>>>>>>>> going
>>>>>>>> to be used.
>>>>>>>> /L
>>>>>>>>
>>>>>>>> On Sat, Oct 1, 2011 at 12:28 PM, Xavier MONTILLET 
>>>>>>>> <xavierm02....@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Right. I totally forgot that...
>>>>>>>>> So I have to keep only two operands per operator.
>>>>>>>>> Thank you for answering :)
>>>>>>>>>
>>>>>>>>> On Sat, Oct 1, 2011 at 12:48 AM, Poetro <poe...@gmail.com> wrote:
>>>>>>>>> > The + operator is not just for numbers and can have sideeffects, if
>>>>>>>>> > the operands have different types or the operands are not numbers or
>>>>>>>>> > strings (and even in that case). This makes the + operator tricky.
>>>>>>>>> >
>>>>>>>>> >>>> "boo" + 1 + 2
>>>>>>>>> > "boo12"
>>>>>>>>> >>>> 1 + 2 + "boo"
>>>>>>>>> > "3boo"
>>>>>>>>> >
>>>>>>>>> >>>> var a = {toString: function () { return 1; }, valueOf: function 
>>>>>>>>> >>>> () {
>>>>>>>>> >>>> return 2; }}, b = 0; a + b
>>>>>>>>> > 2
>>>>>>>>> >
>>>>>>>>> > --
>>>>>>>>> > Poetro
>>>>>>>>> >
>>>>>>>>> > --
>>>>>>>>> > To view archived discussions from the original JSMentors Mailman 
>>>>>>>>> > list:
>>>>>>>>> > http://www.mail-archive.com/jsmentors@jsmentors.com/
>>>>>>>>> >
>>>>>>>>> > To search via a non-Google archive, visit here:
>>>>>>>>> > http://www.mail-archive.com/jsmentors@googlegroups.com/
>>>>>>>>> >
>>>>>>>>> > To unsubscribe from this group, send email to
>>>>>>>>> > jsmentors+unsubscr...@googlegroups.com
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> To view archived discussions from the original JSMentors Mailman list:
>>>>>>>>> http://www.mail-archive.com/jsmentors@jsmentors.com/
>>>>>>>>>
>>>>>>>>> To search via a non-Google archive, visit here:
>>>>>>>>> http://www.mail-archive.com/jsmentors@googlegroups.com/
>>>>>>>>>
>>>>>>>>> To unsubscribe from this group, send email to
>>>>>>>>> jsmentors+unsubscr...@googlegroups.com
>>>>>>>>
>>>>>>>> --
>>>>>>>> To view archived discussions from the original JSMentors Mailman list:
>>>>>>>> http://www.mail-archive.com/jsmentors@jsmentors.com/
>>>>>>>>
>>>>>>>> To search via a non-Google archive, visit here:
>>>>>>>> http://www.mail-archive.com/jsmentors@googlegroups.com/
>>>>>>>>
>>>>>>>> To unsubscribe from this group, send email to
>>>>>>>> jsmentors+unsubscr...@googlegroups.com
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-- 
To view archived discussions from the original JSMentors Mailman list: 
http://www.mail-archive.com/jsmentors@jsmentors.com/

To search via a non-Google archive, visit here: 
http://www.mail-archive.com/jsmentors@googlegroups.com/

To unsubscribe from this group, send email to
jsmentors+unsubscr...@googlegroups.com

Reply via email to