=> uAvBw -> uxvxw, where the x's may be different x's.
=>
=> This is intuitively context-sensitive (but not quite complying to the
=> spec of uAv -> uxv). In the sense of all terminals on the left side
=> being preserved. But handling that is hard, IMO anyway.
=>
=> How would you handle such a beast for any number of
=> terminal-nonterminal-terminal-... combinations?
How about hand-rolling a regexp for each RHS, based on the
LHS? The code below uses $xTerm, $xNon_term and $xAlphabet
from the previous email, and shows how &check_CH1 can be
enhanced to handle this new definition of CH-1 grammars.
The "|*|" lines contain the new code.
|| # *** Incomplete code follows! ***
|| # *** Regexp definitions deleted for brevity. ***
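|| sub check_CH1 {
||     # *** Comments deleted for brevity. ***
||     my @productions = @_;
||
||     foreach my $production (@productions) {
||         # Make a rule from $production, by stringifying the
||         # left and right sides.
||         my $rule = join('', @{$production->{left}}, $arrow,
||                             @{$production->{right}});
||
||         # \1 and \2 are backreferences; $1 and $2 inside the pattern
||         # would interpolate captures left over from an earlier match.
||         return 0 if $rule !~ m{
||             ($xTerm*) $xNon_term+ ($xTerm*)  # LHS
||             $xArrow
||             \1 $xAlphabet* \2                # RHS
||         }x;
||
|*|        # Turn the LHS into a regexp for the RHS: terminal runs must
|*|        # match literally, non-terminal runs may become anything.
|*|        my $sxRHS = join '', @{$production->{left}};
|*|        $sxRHS =~ s/($xTerm+)/"\Q$1"/ge;
|*|        $sxRHS =~ s/$xNon_term+/.*/g;
|*|        # Anchor the match so the whole RHS must fit the pattern.
|*|        return 0 if join('', @{$production->{right}}) !~ m/\A$sxRHS\z/;
||     }
||     1;
|| } # &check_CH1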
||
|| # *** Incomplete code above! ***
This code is just experimental, but it may get you on the
right track.
If the quoted text produced by "\Q$1" happens to match
$xNon_term, the second substitution will overwrite part of
it, and then we are in trouble. But assuming the terminals
and non-terminals are cleanly separated sub-alphabets, this
approach should work.
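
Since the regexp definitions were left out above, here is a small
self-contained sketch of just the LHS-to-regexp step. It assumes
lowercase letters are terminals and uppercase letters are
non-terminals (my guess for illustration, not the definitions from
the previous email), and lhs_to_rhs_regexp is a made-up helper name.

```perl
use strict;
use warnings;

# Assumed sub-alphabets, for illustration only:
# terminals are lowercase letters, non-terminals are uppercase letters.
my $xTerm     = qr/[a-z]/;
my $xNon_term = qr/[A-Z]/;

# Hypothetical helper: build an anchored regexp that the RHS must
# match, given the list of symbols on the LHS.
sub lhs_to_rhs_regexp {
    my @left    = @_;
    my $pattern = join '', @left;

    # Quote each run of terminals so it can only match itself.
    $pattern =~ s/($xTerm+)/\Q$1\E/g;

    # Each run of non-terminals may be rewritten to any string.
    $pattern =~ s/$xNon_term+/.*/g;

    # Anchor at both ends so the whole RHS has to fit.
    return qr/\A$pattern\z/;
}

my $re = lhs_to_rhs_regexp(qw(u A v B w));    # behaves like /\Au.*v.*w\z/

print "uxvyyw" =~ $re ? "ok\n" : "rejected\n";    # prints "ok"
print "zuxvxw" =~ $re ? "ok\n" : "rejected\n";    # prints "rejected"
```

Note that ".*" also accepts an empty replacement; if your definition
requires each non-terminal to produce at least one symbol, ".+" would
be the stricter choice.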
peace, || From Child Labour to Scholarship
--{kr.pA} || http://makeashorterlink.com/?J17F21842