Hi,
I need a way to terminate a subrule so that the higher-level rule can start a new node. The morestuff subrule allows the alternative "ws number" (a space followed by a number), but once the parser has matched the space it appears to be committed to that alternative and expects a number to follow. Grammatica reports an unexpected-token error; the output up to the failure is:
reference(2001)
  SDO(2007)
    ieee(1019): "IEEE", line: 1, col: 1
    ws(1043): " ", line: 1, col: 5
    alphaNumWDelim(2022)
      alphaNum(2023)
        number(1048): "1003", line: 1, col: 6
      morestuff(2024)
        delimiter(2019)
          period(1003): ".", line: 1, col: 10
      morestuff(2024)
        alphaNum(2023)
          number(1048): "1", line: 1, col: 11
      morestuff(2024)
        delimiter(2019)
          dash(1001): "-", line: 1, col: 12
      morestuff(2024)
        alphaNum(2023)
          alpha(1050): "X", line: 1, col: 13
      morestuff(2024)
        ws(1043): " ", line: 1, col: 14
        number(1048): "49", line: 1, col: 15
      morestuff(2024)
        ws(1043): " ", line: 1, col: 17
Error: in C:\brianm\parseInput2.txt: line 1:
unexpected token "SUPP", expected <number>
IEEE 1003.1-X 49 SUPP 1A VOL 2
                 ^
Note that "49" must stay with the base number "1003.1-X", while the keyword "SUPP" (an abbreviation of SUPPLEMENT) must go into the suffix node. I realize that you prefer to %ignore% the space character, but I need it as a token to handle references such as "MIL-STD-123-7 SUPP 1A" and "MIL-STD-123 SUPP 1A" (the impliedRev subrule handles the "-7" implied revision). The grammar follows:
%header%
GRAMMARTYPE = "LL"
DESCRIPTION = "A grammar for translating document number reference
into structured keys: family, revision, suffix, etc."
%tokens%
dash = "-"
slash = "/"
period = "."
lparen = "("
rparen = ")"
mil = "MIL"
dod = "DOD"
jan = "JAN"
std = "STD"
hdbk = "HDBK"
prf = "PRF"
dtl = "DTL"
qml = "QML"
qpl = "QPL"
cfr = "CFR"
usc = "USC"
sae = "SAE"
astm = "ASTM"
ieee = "IEEE"
rev1 = "REVISION"
rev2 = "REV"
part1 = "PART"
part2 = "PT"
chapter1 = "CHAPTER"
chapter2 = "CHAP"
volume1 = "VOLUME"
volume2 = "VOL"
validNotice1 = "VALID NOTICE"
validNotice2 = "VAL NOTICE"
cancNotice = "CANC NOTICE"
canc = "CANC"
interimChg1 = "INTERIM CHANGE"
interimChg2 = "INT CHG"
amd1 = "AMEND"
amd2 = "AMD"
interimAmd1 = "INTERIM AMD"
interimAmd2 = "INT AMD"
supplement1 = "SUPPLEMENT"
supplement2 = "SUPP"
oo = "-00" // preceding dash required for in-lieu-of (unable to isolate leading zeroes)
oh = "-0H" // both in-lieu-of and handbook
sparen = " (" // get around problem with ws lparen
ws = " " // plain old space character
delim = <<[+=",':;!#$%&*<>[EMAIL PROTECTED]>>
fluff = <<[\t\n\r\f\a\e]+>> %ignore% // do not explicitly handle unprintable
number = <<[0-9]+>> // one or more digits
string = <<[A-Z][A-Z]+>>
alpha = <<[A-Z]>>
%productions%
reference = ( milspec | fedspec | QPLdoc | regulation | SDO ) suffix* ;
milspec = milPrefix basic ;
fedspec = fedPrefix basic ;
QPLdoc = ( qpl | qml ) dash ( fedspec | basic | SDO ) ; // Qualified Products List
regulation = [string] regPattern alphaNumWDelim ; // optional string represents govt agency
SDO = ( sae | astm | ieee ) [ delimiter | ws ] alphaNumWDelim ;
suffix = impliedAmend | ( [ws] keyword [ws] value? ) ;
keyword = rev1 | rev2 | part1 | part2 | chapter1 | chapter2
| volume1 | volume2 | validNotice1 | validNotice2
| cancNotice | canc | amd1 | amd2 | interimAmd1 | interimAmd2
| interimChg1 | interimChg2 | supplement1 | supplement2
| string | alpha ;
value = [delimiter] alphaNumWDelim+ ;
milPrefix = ( mil | dod | jan ) dash [middleAlpha] ;
basic = baseNum [slashNum] ;
baseNum = number [impliedRev] ;
slashNum = ( slash | period ) number [impliedRev] ;
fedPrefix = ( string | alpha ) dash alpha indicators ;
regPattern = number ( cfr | usc ) [ string | alpha ] ;
middleAlpha = ( prf | dtl | hdbk | std | alpha ) [indicators] ;
indicators = oo | oh | (dash alpha) | dash ; // in-lieu-of or both or hdbk (single H)
delimiter = dash | slash | period | lparen | rparen | delim ;
impliedAmend = sparen alphaNum+ rparen ; // implied amendment
impliedRev = ( dash number [ string | alpha ] ) | ( string | alpha ) ;
alphaNumWDelim = alphaNum morestuff* ;
alphaNum = number | string | alpha ;
morestuff = alphaNum | delimiter | ws number ;
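To make the suspected failure concrete, here is a minimal Python sketch. It is not Grammatica code and says nothing about how Grammatica is implemented. It assumes the parser picks the "ws number" alternative of morestuff after looking at the space alone, which is what the error message suggests. The token set is a simplified, hypothetical one (every upper-case run is a single "word"), and the names tokenize and parse_base are illustrative only. With a lookahead of one token the loop fails on "SUPP" just as in the output above; with a lookahead of two it stops before the space and leaves it for the suffix rule.

```python
import re

# Simplified, hypothetical token set: just enough for the sample input.
TOKEN_RE = re.compile(
    r"(?P<number>[0-9]+)|(?P<word>[A-Z]+)|(?P<ws> )|(?P<delim>[-./])"
)


def tokenize(text):
    """Split text into (kind, value) pairs; reject anything unrecognized."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at col {pos + 1}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse_base(tokens, start, lookahead):
    """Consume `alphaNum morestuff*` from `start`; return the index after it."""
    if tokens[start][0] not in ("number", "word"):
        raise SyntaxError("expected <number> or a word")
    i = start + 1
    while i < len(tokens):
        kind = tokens[i][0]
        if kind != "ws":
            i += 1  # alphaNum or delimiter alternative of morestuff
            continue
        # A space: peek at the token after it (assumed EOF marker if none).
        following = tokens[i + 1] if i + 1 < len(tokens) else ("EOF", "")
        if following[0] == "number":
            i += 2  # the `ws number` alternative
        elif lookahead >= 2:
            break  # leave the space for the suffix rule
        else:
            # One-token lookahead already committed to `ws number`.
            raise SyntaxError(
                f'unexpected token "{following[1]}", expected <number>'
            )
    return i


if __name__ == "__main__":
    tokens = tokenize("IEEE 1003.1-X 49 SUPP 1A VOL 2")
    # Index 2 skips "IEEE" and the space that follows it.
    end = parse_base(tokens, 2, lookahead=2)
    print("".join(value for _, value in tokens[2:end]))  # 1003.1-X 49
```

What I am looking for is the grammar-level equivalent of the lookahead=2 branch: a way for morestuff to decline the space when it is not followed by a number.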
Thanks in advance for your help, Brian.
By the way, I am an associate of Anant Mistry.
