Hi All
I have put in place some new functionality which I'm calling User Defined
Functions. It is essentially a lightweight way for users to define new
functions for use in their SPARQL queries without having to write the Java code
for the function themselves. It is less powerful than adding a full extension
function, since you can't use arbitrary Java code, but it provides a simple way
to encapsulate complex or large expressions into simple function calls. In
essence it is an expression aliasing mechanism.
For example we can define a "square" function like so:
List<Var> args = new ArrayList<Var>();
args.add(Var.alloc("x"));
UserDefinedFunctionFactory.getFactory().add("http://example/square", "?x * ?x",
args);
Then we can go ahead and use this in queries:
SELECT (<http://example/square>(3) AS ?ThreeSquared) { }
Expressions can be defined either by providing a raw expression string which
conforms to the SPARQL expression syntax or by programmatically building an
Expr instance.
Bear in mind that this functionality only goes so far, and there are some
provisos that I should point out.
1 – Dependencies between user defined functions
It is possible to define functions that depend on other user defined functions,
but this is risky: if the other function definition is changed or removed, the
behavior of your function may change with it. To avoid this, the default
behavior is not to preserve your dependencies but rather to expand out the
function definitions. So for example I could define a "cube" function as follows:
List<Var> args = new ArrayList<Var>();
args.add(Var.alloc("x"));
UserDefinedFunctionFactory.getFactory().add("http://example/cube",
"<http://example/square>(?x) * ?x", args);
Internally that is the same as if I defined it as follows since the definitions
will be fully expanded to include the definitions of the other user defined
functions used:
List<Var> args = new ArrayList<Var>();
args.add(Var.alloc("x"));
UserDefinedFunctionFactory.getFactory().add("http://example/cube",
"(?x * ?x) * ?x", args);
This protects users from changing definitions. However, dependencies are
sometimes desired, in which case this behavior can be disabled; see
UserDefinedFunctionFactory.getPreserveDependencies().
Expansion happens at definition time when you call add(), so if you want to
change the behavior you need to redefine all functions that may be affected
by it.
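To make the difference between expanding and preserving dependencies concrete,
here is a toy model in plain Java. It is not how ARQ implements this: the class
UdfExpansionSketch and its methods are made up for illustration, it works on
raw strings rather than Expr trees, and it assumes every function takes a
single parameter named ?x and is called with a single variable or literal.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model only, NOT ARQ's implementation. Assumes one parameter named ?x
// per function and call arguments that are a single variable or literal.
class UdfExpansionSketch {
    private final Map<String, String> definitions = new HashMap<>();
    private final boolean preserveDependencies;

    UdfExpansionSketch(boolean preserveDependencies) {
        this.preserveDependencies = preserveDependencies;
    }

    // Registers a function. Unless dependencies are preserved, calls to
    // already-defined functions are inlined now, at definition time.
    void add(String uri, String expression) {
        definitions.put(uri, preserveDependencies ? expression : expand(expression));
    }

    // Returns the definition exactly as it is currently stored.
    String stored(String uri) {
        return definitions.get(uri);
    }

    // Replaces each call <uri>(arg) to a known function with that function's
    // body, substituting arg for ?x and wrapping the result in parentheses.
    private String expand(String expression) {
        String result = expression;
        for (Map.Entry<String, String> def : definitions.entrySet()) {
            Pattern call = Pattern.compile(
                Pattern.quote("<" + def.getKey() + ">") + "\\(([^()]*)\\)");
            Matcher m = call.matcher(result);
            StringBuffer sb = new StringBuffer();
            while (m.find()) {
                String inlined = "(" + def.getValue().replace("?x", m.group(1)) + ")";
                m.appendReplacement(sb, Matcher.quoteReplacement(inlined));
            }
            m.appendTail(sb);
            result = sb.toString();
        }
        return result;
    }

    public static void main(String[] args) {
        UdfExpansionSketch udfs = new UdfExpansionSketch(false);
        udfs.add("http://example/square", "?x * ?x");
        udfs.add("http://example/cube", "<http://example/square>(?x) * ?x");
        // Redefining square afterwards does not touch the stored cube.
        udfs.add("http://example/square", "?x + ?x");
        System.out.println(udfs.stored("http://example/cube"));
    }
}
```

With expansion on, the stored "cube" no longer mentions "square" at all, which
is why a later change to "square" only takes effect once "cube" is redefined.
With preservation on, the stored "cube" keeps the reference and would follow
any later change to "square".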
2 – Function overloading
Currently there is no support for function overloading, so if you want to
define a function that takes varying numbers of arguments you have to define a
different URI for each variant right now. This is something I intend to add, I
just haven't got round to it yet.
3 – Argument Lists
User defined functions treat all variables in the expression as arguments. If a
variable is used in the expression it must be in the argument list, or an error
will be thrown when trying to define the function. A variable may be in the
argument list and not used in the expression; this only results in a warning.
I may change the latter case to throw an error as well, and move from using
argument lists to ordered sets (LinkedHashSet) to prevent duplicate arguments.
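The rules above can be sketched in plain Java as follows. This is only an
illustration under simplifying assumptions, not the actual check: the class
ArgumentListCheckSketch and its method names are hypothetical, variables are
found with a regular expression over the raw string rather than by walking an
Expr, and only the ?name form of variables is recognised (so a "?" inside a
string literal or URI would be misread).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the argument list rules, NOT the real ARQ check.
class ArgumentListCheckSketch {
    // Simplified variable syntax: ? followed by an identifier.
    private static final Pattern VARIABLE =
        Pattern.compile("\\?([A-Za-z_][A-Za-z0-9_]*)");

    // Collects the distinct variable names in order of first appearance.
    static Set<String> variablesIn(String expression) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = VARIABLE.matcher(expression);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    // Throws if the expression uses an undeclared variable; returns one
    // warning message per declared argument that the expression never uses.
    static List<String> check(String expression, List<String> args) {
        // An ordered set drops duplicate arguments but keeps their order.
        Set<String> declared = new LinkedHashSet<>(args);
        Set<String> used = variablesIn(expression);
        for (String name : used) {
            if (!declared.contains(name)) {
                throw new IllegalArgumentException(
                    "Variable ?" + name + " is not in the argument list");
            }
        }
        List<String> warnings = new ArrayList<>();
        for (String name : declared) {
            if (!used.contains(name)) {
                warnings.add("Argument ?" + name + " is never used");
            }
        }
        return warnings;
    }
}
```

Here check("?x * ?x", ["x", "y"]) would return a single warning about ?y,
while check("?x * ?y", ["x"]) would throw because ?y is not declared.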
4 – Overriding Function Libraries
It is possible right now to define a function that overrides an extension
function provided by another function library, e.g. the ARQ one. I'm not sure
whether this should be prevented or not. Any thoughts?
Let me know what you think, and any ideas for refinement beyond what I have
already listed here.
Rob