Part of Markus's email was that he didn't want to write a separate python rule. That is why I thought that putting the scripts inline in the metavariable declaration might be more acceptable. But we can see what he thinks about the proposed solution, since it is structured in a somewhat different way than the python code he would currently have to write.

Does python have anonymous functions? If not, in python one will have to write a named function, and that could be confusing.
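
For reference: Python does have anonymous functions, written with lambda, but the body is limited to a single expression. Below is a minimal sketch of the badnames check written both ways. The names badnames, is_acceptable and is_acceptable_named are only illustrative and are not part of any Coccinelle API.

```python
# Illustrative only: these names are not part of any Coccinelle API.
badnames = ["one", "two", "three"]

# Anonymous function: the body must be a single expression.
is_acceptable = lambda x: x not in badnames

# Anything that needs statements has to be a named function.
def is_acceptable_named(x):
    if not x:
        return False
    return x not in badnames
```

So a simple constraint could stay anonymous, and only a more involved one would need a name.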

Already you can define functions in the initialize block of the semantic patch. So it is not necessary to duplicate anything more than the function call.

I guess the purpose of making c abstract is to have a metavariable for which the matched code has a certain form. That could be useful, but I worry a bit about the information being dispersed (in the function call case as well...).

I'm not sure that abstract is such a good name. The current word for such things inside the implementation is constraint.

Since we are not targeting functional programmers, I don't think that : will suggest a type declaration to many people.

I think that overall the issue comes down to how self-contained one would like the rules to be.

julia



On Fri, 16 Dec 2011, Arie Middelkoop wrote:

I'd like to push another option on the pile...

I want to introduce... a _rule_ constraint. Here is the example again with some 
additional syntax for constraints:

@@
identifier f = rule r;
@@

* f(3)

It says: accept an identifier as f if it satisfies _abstract_ rule r. What is 
this r then?
It is almost a normal rule, but with some subtle differences:

@abstract r script:ocaml@
@@

fun x -> not (List.mem x badnames)

The script must be a function that takes an AST as argument and returns a 
boolean that indicates whether the argument is acceptable. When we then 
want to check if an identifier satisfies r, we execute the script with 
that identifier as parameter.
(if the script is python, it must be some function that takes a string as first 
parameter, and perhaps instead of returning a boolean, it calls an API function 
that tells whether or not the input is (un)satisfactory)

Now some subtleties:
a) an abstract rule may appear anywhere in the file, as long as the meta 
variables that it may inherit exist in the environment.
b) you may not inherit from an abstract rule. In case of a script this is 
already the case.
c) abstract rules are only executed/matched through rule constraints


*********


This approach does not have the problems that arise when you inline scripts in 
constraints and makes use of notation that you already have for scripts.


**********


As a nicer notation, we should write the constraint with a colon instead of an 
equal sign:
@@
identifier f : rule r;
@@
This could be interpreted as "rule r" being the type of "f", since types are 
constraints on values ;)


***********


You can also still combine rule constraints with not,or,and to get something 
like:

identifier f = rule a || rule b;

But if you use this combination often, you could then also make an abstract 
rule c that does:

identifier f = rule c;

@abstract c@
identifier x = rule a || rule b;
@@
x

Wait a second, this abstract rule is not a script!
Yes, why limit this feature to scripts if we have such a nice language at our 
disposal for writing pattern matches!

What happens here is that the body of rule c matches against an identifier x, 
which is constrained to either rule a or rule b. The body may be much more 
complicated, but it should probably be pure (i.e. not have + or - code)...
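
As an analogy only (this is not how Coccinelle evaluates rules), the combination of rule constraints can be pictured as predicate composition. The Python sketch below uses made-up predicates rule_a and rule_b and hypothetical helpers either and negate to build the equivalent of rule c:

```python
# Hypothetical predicates standing in for abstract rules a and b.
def rule_a(name):
    return name.startswith("get_")

def rule_b(name):
    return name.endswith("_unlocked")

def either(*rules):
    """Return a predicate that accepts a name if any given rule accepts it."""
    return lambda name: any(rule(name) for rule in rules)

def negate(rule):
    """Return a predicate that accepts exactly what the given rule rejects."""
    return lambda name: not rule(name)

# Plays the role of: identifier x = rule a || rule b;
rule_c = either(rule_a, rule_b)
```

Naming the combination once, as rule_c, is what saves repeating "rule a || rule b" at every use.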


***********


I'm not sure if what I'm writing down above is already making you dizzy, but 
there is more.
You can see an abstract rule as some form of "let" abstraction: let this meta 
variable stand for a more complex pattern.

Here I've got a pattern where "e" stands for an integer expression that must be the addition of two other 
integer expressions. We give the instantiation of the abstract rule an explicit name "r" and can then inherit 
from it to get access to some of the meta variables "a" and "b" that it matches:

@@
int e = rule add as r;
int r.a;
int r.b;
@@

- f(e)
+ (a - b)

So, the add rule simply matches "a + b":

@abstract add@
int a;
int b;
@@

a + b
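
To make the inheritance concrete, here is a toy Python analogy (not Coccinelle code, and not a claim about how the matcher is implemented). Expressions are modelled as plain tuples, and the hypothetical match_add returns the bindings for "a" and "b" that the enclosing rule would inherit through "r":

```python
# Toy expression trees as tuples: ("int", 3), ("var", "x"), ("+", left, right).
def match_add(expr):
    """Return bindings for a and b if expr has the form a + b, else None."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "+":
        return {"a": expr[1], "b": expr[2]}
    return None

def rewrite_call_argument(expr):
    """Mimic '- f(e)' / '+ (a - b)': rewrite only when the constraint holds."""
    r = match_add(expr)  # corresponds to "rule add as r"
    if r is None:
        return None      # constraint failed, so there is no match
    return ("-", r["a"], r["b"])
```

The point of the sketch is that the constraint both filters the match and supplies bindings to the rule that uses it.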


***********

I hope I did not dazzle you too much with this. I think that at least the very 
first part of this message deserves some attention.

Cheers,
Adriaan


Julia,

This proposal sounds very useful to me.

I would not want to remove any of the existing functionality.
As it currently stands cocci can be used by people who only
know C and I think it would be useful to keep this ability.

Relating to your C++ post earlier this week.  I think it was in
the later 90s that somebody told me that writing a C++ compiler
was about seven times the effort of writing a C compiler.  A lot
of new stuff has been added in C++11 and not so much in C1x,
so I suspect the ratio will have gone up.

Perhaps a solution would be to allow scripting code in metavariable
declarations. Then we could in principle get rid of all sorts of
constraints. There would be no need to learn the SmPL constraints and
where they could occur. One would just have to remember the syntax of
one's preferred scripting language (from among the options available :).

So for example, one could write:

@initialize:ocaml@
let badnames = ["one";"two";"three"]

@@
identifier x where ocaml{not (List.mem x badnames)};
@@

*f(3)

I imagine that it is possible to do the same thing in python.

Similarly, one could get rid of the regular expression matching
notation. I assume python provides something for that, for those that
don't want to use ocaml. Note that the interaction with python might be
less efficient than the current native ocaml version.
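
For what it is worth, Python's standard re module covers regular expression matching. A sketch of what such a constraint could look like follows; the pattern and the name is_lock_function are made up for illustration, and how the function would be hooked into a metavariable declaration is left open:

```python
import re

# Hypothetical constraint: accept identifiers that look like lock functions.
LOCK_NAME = re.compile(r"^(spin|mutex)_(lock|unlock)$")

def is_lock_function(name):
    """Return True if the identifier name matches the pattern."""
    return LOCK_NAME.search(name) is not None
```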

One could also get rid of the subterm notation, expression e <= r.e1;,
although that would currently require someone who wants this
functionality to use ocaml, because currently only ocaml code gets a
representation of the abstract syntax tree.

An issue is what metavariables this code can use. In the above, I have
assumed that the ocaml code is implicitly parameterized by the
metavariable that is being declared. It would be too complicated to
allow the code to have access to other metavariables being declared at
the same time. But if an appropriate syntax for declaring them could be
found, it would be possible to allow metavariables to be inherited from
previous rules. Currently we have @@ to separate metavariable
declarations from the script code. Perhaps we could use that, although
it seems a bit ugly. Another option would be to have no separator. The
end of the metavariable list would be the occurrence of the last x << r.y;

What do you think?

julia
_______________________________________________
Cocci mailing list
[email protected]
http://lists.diku.dk/mailman/listinfo/cocci
(Web access from inside DIKUs LAN only)


--
Derek M. Jones                  tel: +44 (0) 1252 520 667
Knowledge Software Ltd          blog:shape-of-code.coding-guidelines.com
Source code analysis            http://www.knosof.co.uk