Hi all,

I am looking at what sort of grammar can be used to represent and edit the content MathML included inside components and reactions in CellML models. I propose the following grammar (this is a valid and unambiguous bison grammar). It has been designed to be intuitive to people who know MATLAB (or C, or other languages which use a mixture of prefix and infix (in-order) operators). It would be nice if there were some consistency between tools, so I am interested in feedback both from potential users and from the developers of other tools.

  %token IDENTIFIER
  %token QUOTED NUMBER
  %token PIECEWISE CASE THEN ELSE

  %left T_EQEQ T_NOTEQ T_GREATEREQUAL T_LESSEQUAL '<' '>'
  %left '!'
  %left '+' '-'
  %left '*' '/'

  %%

  textrep: expr '=' attrs expr
         | textrep '=' attrs expr;

  attrs: '{' attrlist '}'
       | /* empty */;

  attrlist: attr attrlist
          | /* empty */;

  attr: IDENTIFIER '=' QUOTED;

  arglist: expr ',' arglist
         | expr;

  expr: IDENTIFIER attrs '(' arglist ')'
      | '(' expr ')'
      | expr '*' attrs expr
      | expr '+' attrs expr
      | expr '-' attrs expr
      | expr T_EQEQ attrs expr
      | expr T_NOTEQ attrs expr
      | expr '<' attrs expr
      | expr '>' attrs expr
      | expr T_GREATEREQUAL attrs expr
      | expr T_LESSEQUAL attrs expr
      | expr '/' attrs expr
      | IDENTIFIER attrs
      | NUMBER attrs
      | PIECEWISE attrs '(' caselist maybeelse ')'
      | '!' attrs expr
      | '-' attrs expr;

  caselist: casepair caselist
          | /* empty */;

  casepair: CASE expr THEN expr;

  maybeelse: ELSE expr
           | /* empty */;

Examples of valid syntax:

  x ={id="myequation"} piecewise(case y==1 then abs(z*y) / integral(sin(a), a, 0, 10) case y < 1 then pow(y, z) else -(y + z - 3{units="volts"}*c))

  y * z = a * b = c / pow(2, remainder(5, 3)) / 7{units="second"}

  diff(position, time) = velocity

  a = sin{definitionURL="http://www.weirdmath.org/my-weird-version-of-sine#"}(pi * x)

There is one minor issue with the identifier syntax: names can conflict, e.g. if we had a variable called pi. We would have to mangle it. I suggest that the tokeniser use # to override symbol names, so that, for example, #pi maps to <ci>pi</ci>, while pi maps to <pi/>. Of course, # is not normally required, so x maps to <ci>x</ci>.

My editing syntax allows the encoding of things which are valid MathML, but which no tool will be able to use (e.g. a CellML tool would not be able to integrate equations like the last one, as the semantics of sine have been overridden using definitionURL). The aim is to represent all valid content MathML, rather than everything that PCEnv or any other tool supports.
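To make the # rule concrete, here is a minimal sketch in Python of how a tokeniser might map identifier tokens to content MathML. Only the pi / #pi / x cases come from the proposal above; the function name identifier_to_mathml, the RESERVED table, and the other reserved names in it are assumptions for illustration, not part of the grammar or of any existing tool.

```python
# Hypothetical sketch of the proposed '#' override rule.
# RESERVED is an illustrative subset of MathML constant elements; which
# names a real tokeniser would reserve is an assumption, apart from "pi".
RESERVED = {
    "pi": "<pi/>",
    "exponentiale": "<exponentiale/>",
    "true": "<true/>",
    "false": "<false/>",
}


def identifier_to_mathml(token: str) -> str:
    """Map one identifier token to a content MathML element."""
    if token.startswith("#"):
        # '#' forces the name to be treated as a plain variable.
        name = token[1:]
        if not name:
            raise ValueError("'#' must be followed by a name")
        return f"<ci>{name}</ci>"
    if token in RESERVED:
        # Unprefixed reserved names map to the built-in symbol.
        return RESERVED[token]
    # Everything else is an ordinary variable; no '#' needed.
    return f"<ci>{token}</ci>"


for tok in ("pi", "#pi", "x"):
    print(tok, "->", identifier_to_mathml(tok))
```

Going the other way (MathML back to text), a tool would presumably emit the # prefix only when a <ci> name collides with a reserved name, so that ordinary variables stay unmangled.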
Representing all valid content MathML means that going into edit mode on an equation, not changing anything, and then returning is guaranteed not to change the semantic meaning of the model.

The units attribute will default to dimensionless if not present, but can be explicitly set on constants.

For the IDENTIFIER attrs '(' arglist ')' production, all standard MathML operators (other than those for which an in-order (infix) operator is defined) will be supported as IDENTIFIER, as well as some special values for tagging certain qualifiers. The integral example above uses lowlimit and uplimit, which are inferred from their position, but you could alternatively write something like

  integral(sin(a), a, condition(in(a, interval{type="open"}(0, 10))))

Opinions?

Best regards
Andrew

_______________________________________________
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion