How to speed up a grammar

2009-06-06 Thread Richard Hainsworth

Leon Timmermans wrote:

If you want to write a fast parser for XML, preventing backtracking is
going to be quite essential. I suspect the problem is your grammar,
not the grammar engine itself. You could post it to perl6-users and
ask for advice on it.

Leon


Below is the grammar.

I am only interested in the tag names xs1:text, xs1:playlist and
(within playlist) xs1:item.

The only attributes I want are 'author', 'title', and 'id'.

grammar sony_grammar;
rule TOP {
 ^
 <xmldecl>?
 <root=element>
 $
}

rule xmldecl {
  '<?xml'
  'version'  '=' '"' $<version>=<-[\"]>+ '"'
  'encoding' '=' '"' $<encoding>=<-[\"]>+ '"'
  '?>'
}

rule element {
 '<' <name> <attribute>*
 [
 | '/>'
 | '>' <element>* '</' $<name> '>'
 ]
}

rule attribute { <name> '=' '"' $<value>=<-["]>* '"' }

token name { <namespace>? $<ename>=<ident> }
token namespace { <ident> ':' }
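
For anyone who wants to try a grammar of this shape, here is a minimal
sketch in current Raku syntax. It is not the code that was timed in
this thread: the block-style declaration, the name SonySketch and the
sample document are assumptions made for illustration, and the 2009
Rakudo may have behaved differently.

```raku
# Hypothetical block-style version of the grammar, for experimenting only.
grammar SonySketch {
    rule TOP { ^ <xmldecl>? <root=element> $ }

    rule xmldecl {
        '<?xml'
        'version'  '=' '"' $<version>=<-["]>+ '"'
        'encoding' '=' '"' $<encoding>=<-["]>+ '"'
        '?>'
    }

    rule element {
        '<' <name> <attribute>*
        [
        | '/>'                               # self-closing tag
        | '>' <element>* '</' $<name> '>'    # closing name must equal opening name
        ]
    }

    rule attribute { <name> '=' '"' $<value>=<-["]>* '"' }

    token name      { <namespace>? $<ename>=<ident> }
    token namespace { <ident> ':' }
}

# Made-up sample input; like the grammar, it has no text between tags.
my $xml = q:to/END/;
<?xml version="1.0" encoding="UTF-8"?>
<xs1:playlist title="demo">
  <xs1:item id="1" author="A" title="T"/>
</xs1:playlist>
END

my $m = SonySketch.parse($xml);
if $m {
    say 'root: ', ~$m<root><name>;
    # List each child element with its attributes.
    for $m<root><element>.list -> $item {
        say ~$item<name>, ': ',
            $item<attribute>.map({ ~.<name> ~ '=' ~ .<value> }).join(', ');
    }
}
else {
    say 'no match';
}
```

Both rule and token ratchet (neither backtracks into an atom once it
has matched), so a grammar built only from those should already avoid
most of the backtracking mentioned above.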




Re: How to speed up a grammar

2009-06-06 Thread Minimiscience

On Jun 6, 2009, at 10:32 AM, Richard Hainsworth wrote:

rule element {
'<' <name> <attribute>*
[
| '/>'
| '>' <element>* '</' $<name> '>'
]
}


This is just a wild, uneducated, possibly delusional guess, but I
don't think that vertical bar before the '/>' should be there. I
think it might be causing the grammar engine to check whether it can
omit the ending of each tag and attach it to the enclosing tag
instead, which it can't confirm without examining the whole file at
least once for each tag.


-- Minimiscience
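
One way to test the guess about the leading vertical bar is to compare
two otherwise identical patterns on a few tiny inputs. The sketch below
uses current Raku syntax; the token names with-bar and without-bar are
made up for this test. As far as I know, the language design documents
describe a leading | in an alternation as ignored, but that says
nothing certain about how the 2009 grammar engine treated it.

```raku
# Two hypothetical tokens that differ only in the leading vertical bar.
my token with-bar    { '<' \w+ [ | '/>' | '>' ] }
my token without-bar { '<' \w+ [   '/>' | '>' ] }

# '<a' has no tag ending: it would match with-bar only if the leading
# bar introduced an empty alternative.
for '<a/>', '<a>', '<a' -> $s {
    my $with    = so $s ~~ / ^ <with-bar> $ /;
    my $without = so $s ~~ / ^ <without-bar> $ /;
    say "input=$s with-bar=$with without-bar=$without";
}
```

If the two columns agree for every input, the bar is not changing what
the pattern accepts, and the slowdown has to come from somewhere else.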