Re: Standardizing property functions and/or full text search in SPARQL

Andy Seaborne Fri, 09 Mar 2012 13:30:41 -0800

A bit of history:

The idea of property functions is copied from cwm/N3 (where they are called built-in properties). A good property function is one that describes meaning:


 (?a ?b) math:sum ?c .

That expresses a relationship. Of course, in practice it can't run backwards, but if there is a set of ?a ?b ?c then they have a relationship just like normal properties. It does subtly assume something about the way execution happens, in that parts of the BGP lexically before the property function have bound variables before the property function is called. Normally, a BGP over triples can be executed in any order - you get the same answers; it just changes the number of negative cases to consider.
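To make the ordering assumption concrete, here is a minimal Python sketch of how an engine might evaluate `(?a ?b) math:sum ?c`. It is an illustration only: the function name `math_sum`, the dict-based bindings and the error raised are all hypothetical, not how cwm or ARQ actually implement it.

```python
from typing import Dict, Iterator

Binding = Dict[str, float]

def math_sum(binding: Binding, a: str, b: str, c: str) -> Iterator[Binding]:
    """Hypothetical evaluator for the pattern (?a ?b) math:sum ?c."""
    # The function cannot run backwards: both inputs must already be
    # bound by patterns evaluated earlier in the BGP.
    if a not in binding or b not in binding:
        raise ValueError("math:sum needs ?%s and ?%s bound first" % (a, b))
    total = binding[a] + binding[b]
    if c in binding:
        # ?c is already bound: act as a filter, like matching a triple.
        if binding[c] == total:
            yield binding
        return
    # ?c is unbound: extend the solution with the computed value.
    yield {**binding, c: total}
```

With ?a and ?b bound the pattern behaves like an ordinary property (it either extends or checks a solution); with them unbound it fails, which is why execution order matters here and not for a plain BGP.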

It also assumes lists are not structures of triples but first-class items in the data model. That seems like less of an abuse, but property functions do squeeze something into strict SPARQL syntax. For whatever reason, people are more comfortable with that than with syntax extensions.

So when the SPARQL-WG did the features and requirements definition phase, we decided not to formalize property functions. Indeed, relying on the "good property function" characterisation, you can argue there is nothing to define. That a relationship is computed, and not directly in the data, is not important.

The fact it might affect some engines doing different evaluation made it politically sensitive.

The one case that argued for them was text search. It does not require property functions; they just squeeze it into SPARQL 1.0 syntax. The WG could have decided to do text search with special syntax and not property functions. As a property function, a text search does express a relationship, possibly indirect, between a thing (literal, document) and a text query string.

The text search has other issues: there isn't a standard syntax for text search and it looked like a monster work item.

For regexes, SPARQL uses the XQuery/XPath Functions and Operators regex language [1]. And that is so close to Java, Perl etc etc that it makes a difference only to picky implementers and no one else [3].

For free text, back then, Lucene syntax was common but not nearly as universal as Perl regex, which has displaced variations and is available in C, Perl, Java and all their friends. So the group would have to at least survey existing candidates and define the language; XPath full text [2] wasn't finished then.

I doubt the WG could have done full free-text a la XPath full text. Even a subset would be significant to spec and test.


It would displace other things, given the WG has bounded resources.

The other issue was the amount of work it would take to implement. Regex implementations exist for (nearly?) every language.

Free text looked like it would require SPARQL implementers to take on a large piece of work. OK (maybe) if you can use Lucene or a clone, but that isn't the situation for everyone.

It felt at the time like a choice of free text and not much else. Aggregates were commonly implemented, a clear need and known to be practical (even so, there has been some resistance to the amount of work they need). Text search was too big a topic to undertake.


        Andy

[1] http://www.w3.org/TR/xpath-functions/#regex-syntax
[2] http://www.w3.org/TR/xpath-full-text-10/
[3] Look at the flags.
    Even ARQ uses Java by default and Xerces on request
    (Xerces has an exact XSD regex engine)


On 09/03/12 19:56, Robert Vesse wrote:

Hi Frank

I do not believe either of these is on the agenda for the current
round of SPARQL standardization but it may be worth you suggesting
these for inclusion as a Future Work item on the comments mailing
list - [email protected] - so that they can be included
on the list at http://www.w3.org/2009/sparql/wiki/Future_Work_Items
and feed into any future SPARQL working group.

FWIW there are a number of interoperable implementations of the
LARQ style syntax already out in the wild - my own dotNetRDF
implements this, as does Clark & Parsia's Stardog and possibly others
I'm not aware of.  Also property functions in general are widely
implemented for a variety of purposes in a whole range of triple
stores and SPARQL engines.

The slightly subversive property function syntax is a little awkward
and at odds with the pure SPARQL specification but it addresses the
general limitation of extension functions in SPARQL that they can
only return a single value and the 1.0 specific limitation that you
could not actually bind the result of an extension function to a
variable.  Even with BIND in SPARQL 1.1 you can only assign a single
value in a BIND, so you'd have to have multiple extension functions
to get the matches and the scores (and then how do you relate
them?).
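As a rough illustration of that difference, the Python sketch below
models a text-search property function as something that yields whole
solution rows, each binding a match and its score together. The name
`text_match`, the in-memory list of literals and the term-count
scoring are all made up for the example; this is not how LARQ or
Lucene score results.

```python
from typing import Dict, Iterator, List

def text_match(literals: List[str], query: str) -> Iterator[Dict[str, object]]:
    """Hypothetical property function: yields one row per matching literal."""
    terms = query.lower().split()
    for lit in literals:
        words = lit.lower().split()
        # Toy score: how many times the query terms occur in the literal.
        score = sum(words.count(term) for term in terms)
        if score > 0:
            # Match and score are bound in the same row, so they stay related.
            yield {"lit": lit, "score": score}
```

An extension function called from BIND could return only one of these
values per call, which is the limitation described above.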

Rob

On Mar 9, 2012, at 11:44 AM, Frank Budinsky wrote:



Hi,

I'm trying to get a handle on the strategic implications of using
Jena property functions, and specifically the LARQ textMatch
property function approach for supporting full text search.

Does anybody know if there is anything in the works to try to
include property functions in a future version of SPARQL? I noticed
a small amount of discussion about this back in 2008, but haven't
seen anything since. I see that they are not part of the standard
SPARQL 1.1 specification and don't even appear to be an avenue of
extension envisioned by the SPARQL 1.1 specification, which
envisions extension value functions and entailment regimes.

It seems that the syntax of property functions borrows the syntax
of legitimate SPARQL queries but gives it an
implementation-specific meaning that runs counter to SPARQL
semantics. Has there been any attempt to reconcile what property
functions do with the semantics of SPARQL 1.1, as described in
chapter 18: http://www.w3.org/TR/sparql11-query/#sparqlDefinition?

Thanks, Frank
