Hi, all. I'm writing a parser that rewrites end users' search terms for a particular search engine API.
I'm trying to parse out phrases, defined as any run of words, quoted or not, that contains no reserved word (mostly boolean operators). I can't seem to come up with a grammar that returns the longest run of words that doesn't contain a boolean. I've tried adding lookaheads in various spots with no luck: booleans are simply parsed as words, not as reserved words.

Below is my test app with the broken grammar and test phrases. Am I missing something painfully obvious?

--
Neil Kohl
Manager, ACP-ASIM Online
[EMAIL PROTECTED]

#!/usr/local/bin/perl
# $Id: test.pl,v 1.3 2002/10/16 20:04:36 neilk Exp neilk $

use Data::Dumper;
use Parse::RecDescent;
use strict;

$|++;    # unbuffer STDOUT

# Default action: return each production's items (rule name first) as an array ref
$::RD_AUTOACTION = q { [@item[0..$#item]]; };
$::RD_HINT=1;
$::RD_WARN=1;
# $::RD_TRACE=1;

my $grammar =<<'EOG';

query: phrase (reserved_word phrase)(?)
     | <error>

phrase: quoted_phrase
      | nekkid_phrase

quoted_phrase: '"' /[^\"]+/ '"'
             | "'" /[^\']+/ "'"

nekkid_phrase: word(s) ...!reserved_word

reserved_word: /AND\s+NOT|AND|NOT|OR|NEAR/

word: /[^\s\{\}\(\)]+/

EOG

my $p = new Parse::RecDescent($grammar) or die;

# Parse each test phrase from the DATA section and dump the resulting tree
while (<DATA>) {
    chomp;
    print $_, " -> ";
    my $out = $p->query($_);
    print Dumper($out), "\n";
}

exit;

__END__
test AND of OR all AND NOT boolean NEAR operators NOT defined
simple phrase query
"quoted string"
"quoted string's with apostrophe"
'single quoted string'
simple phrase AND "quoted string"
simple phrase AND another phrase
one AND two NEAR three phrases
this OR that AND the other thing NOT that one
broken boolean query AND single quote's