Re: Lang and dt in the graph. Was: Dumb SPARQL query problem

2013-12-02 Thread Hugh Glaser

On 2 Dec 2013, at 06:24, Ross Horne ross.ho...@gmail.com wrote:

 Andy is right (as usual!). With the proposed bnode encoding, the graph 
 becomes fatter each time the same triple is loaded.
But how much fatter was the question.
  
 RDF 1.1 has just fixed the mess caused by blurring the roles of the lexer and 
 the parser, as summarised by David recently: 
 http://lists.w3.org/Archives/Public/public-lod/2013Nov/0093.html
Ah yes, I forgot that everything is rosy now with 1.1 - sorry.
 
 Please don't get back into mixing up the lexer and the parser. The lexical 
 spaces of the basic datatypes are disjoint, so in any language we can just 
 write:
  - 999  instead of "999"^^xsd:integer
  - 9.99 instead of "9.99"^^xsd:decimal
  - "WWV" instead of "WWV"^^xsd:string
  - 2013-06-6T11:00:00+01:00 instead of 
 "2013-06-6T11:00:00+01:00"^^xsd:dateTime
 
 As part of a compiler [1], a lexer gobbles up characters, e.g. 999, and turns 
 the characters into a token. A token consists of a string, called an 
 attribute value, plus a token name, e.g. "999"^^xsd:integer. Only a 
 relatively small handful of people writing compilers for languages should 
 have to care about how tokens are represented, not end users of languages.
Well personally I prefer the first version I used for my course on this when it 
came out in 1977, the Dragon Book - Principles of Compiler Design, before 
Sethi polluted it with all that type-checking stuff :-)
Actually, it wasn’t about blurring the lexer and parser - the graph semantics 
were different.
It was closer to having two representations of zero in the machine (as some 
machines used to have), and having to write code to ensure that you coped with 
both of them.

Of course your examples do raise the issue of multiple representations for the 
same thing if the user is not careful.
23.4, 23.5, 23.0, 23.2, 23, 23.1, 023.0, 023 all of which are different RDF 
terms.
Would a lexer/parser make 23.00 and 23.000 different RDF terms? I find myself 
thinking I should know, but I don’t; my guess is that it should.
(RDF 1.1 doesn’t seem to give guidance on this.)
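
To make the question concrete, here is a minimal Python sketch. It assumes a
literal is modelled as a (lexical form, datatype IRI) pair and that term
equality compares lexical forms character by character, which is how I read
the literal term equality definition in RDF 1.1 Concepts. The names term,
same_term and same_value are made up for illustration and belong to no RDF
library.

```python
from decimal import Decimal

XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

def term(lexical, datatype=XSD_DECIMAL):
    """Model an RDF literal term as a (lexical form, datatype IRI) pair."""
    return (lexical, datatype)

def same_term(a, b):
    """Term equality: lexical form and datatype must match exactly."""
    return a == b

def same_value(a, b):
    """Value equality: map both lexical forms into the decimal value space."""
    return a[1] == b[1] and Decimal(a[0]) == Decimal(b[0])

a, b = term("23.00"), term("23.000")
print(same_term(a, b))   # False: two different terms
print(same_value(a, b))  # True: one and the same number
```

On this reading 23.00 and 23.000 stay distinct terms unless the parser
canonicalises lexical forms on the way in.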

And I find myself getting strangely interested in your dateTime example.
I think most lexers will reject it?
Or friendly ones will treat it as the correct lexical form:
2013-06-06T11:00:00+01:00
(You need to pad the day)
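
A rough sketch of the strict and the friendly behaviour, in Python. The
pattern is a simplified approximation of the xsd:dateTime lexical form (it
checks digit counts only, not value ranges), and is_datetime_lexical and
pad_day are hypothetical helpers, not taken from any real parser.

```python
import re

# Simplified shape of an xsd:dateTime lexical form: year of four or more
# digits, two-digit month/day/hour/minute/second, optional fractional
# seconds, optional timezone. Ranges (e.g. month <= 12) are NOT checked.
DATETIME_LEXICAL = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"
)

def is_datetime_lexical(s):
    """Strict check: does the whole string have the dateTime shape?"""
    return DATETIME_LEXICAL.fullmatch(s) is not None

def pad_day(s):
    """Friendly repair (an assumption, not standard behaviour):
    left-pad a one-digit day so the string can be retried."""
    return re.sub(r"^(-?\d{4,}-\d{2}-)(\d)T", r"\g<1>0\g<2>T", s)

raw = "2013-06-6T11:00:00+01:00"
print(is_datetime_lexical(raw))           # False
print(pad_day(raw))                       # 2013-06-06T11:00:00+01:00
print(is_datetime_lexical(pad_day(raw)))  # True
```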

So maybe we need to get a bit more explicit about the RDF term for dateTime 
(unless I have missed it)?
That the RDF term is always in UTC? - This is what the xsd standard says.
That the RDF term always has a fractional second part? - Good question.
That the RDF term always has a timezone? - Better question.
(See http://www.w3.org/TR/xmlschema-2/#dateTime )
Or are we happy with many different representations of a given dateTime?
(Of course xsd:dateTime does get into problems with year zero, but let’s not 
worry about that :-) )
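
To illustrate the "many different representations" point, a small Python
sketch using only the standard library. It assumes timezone-qualified input
with a numeric offset; to_utc is a made-up helper name. Note that Python
prints UTC as +00:00, whereas the XSD canonical form writes Z.

```python
from datetime import datetime, timezone

def to_utc(lexical):
    """Parse a timezone-qualified dateTime and re-express it in UTC."""
    dt = datetime.fromisoformat(lexical)
    if dt.tzinfo is None:
        # Without a timezone the value cannot be placed on the UTC timeline.
        raise ValueError("no timezone in " + lexical)
    return dt.astimezone(timezone.utc)

a = "2013-06-06T11:00:00+01:00"
b = "2013-06-06T10:00:00+00:00"
print(a == b)                  # False: different lexical forms
print(to_utc(a) == to_utc(b))  # True: the same instant
print(to_utc(a).isoformat())   # 2013-06-06T10:00:00+00:00
```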

But I guess my friendly RDF parser gnomes (all hail!) already have stories for 
all this.

Best
Hugh
 
 For language tags, a little simple conventional datatype subtyping (as 
 opposed to rdfs:subClassOf), could help the programmer further [2]. e.g. a 
 programmer that writes regex("WWV2013"@en, "WWV") clearly meant 
 regex("WWV2013", "WWV") and shouldn't have to care about the distinction, 
 unless I am mistaken.
 
 Regards,
 
 Ross
 
 [1] Ullman, Aho, Lam and Sethi. Compilers: principles, techniques and tools. 
 1986
 [2] Local Type Checking for Linked Data Consumers. 
 http://dx.doi.org/10.4204/EPTCS.123.4
 

-- 
Hugh Glaser
   20 Portchester Rise
   Eastleigh
   SO50 4QS
Mobile: +44 75 9533 4155, Home: +44 23 8061 5652





Lang and dt in the graph. Was: Dumb SPARQL query problem

2013-12-01 Thread Tim Berners-Lee

On 2013-11-23, at 12:21, Andy Seaborne wrote:

 
 
 On 23/11/13 17:01, David Booth wrote:
 [...]
 This would have been fixed if the RDF model had been changed to
 represent the language tag as an additional triple, but whether this
 would have been a net benefit to the community is still an open
 question, as it would add the complexity of additional triples.
 
 Different.  Maybe better, maybe worse.
 
 
 Do you want all your "abc" to be the same language?
 
   "abc" rdf:lang "en" .
 
 or multiple languages:
 
   "abc" rdf:lang "cy" .
   "abc" rdf:lang "en" .
 
 
 ?
 
 Unlikely - so it's bnode time ...
 
 :x :p [ rdf:value "abc" ; rdf:lang "en" ] .

The nice thing about this in an n3rules-like system (where FILTER and WHERE 
clauses are not distinct and some properties are just builtins) is that 
rdf:value and rdf:lang can be made builtins, so a datatyped literal can behave 
just like a bnode with two properties if you want it to.

But I have always preferred it with not 2 extra triples, just one:

:x  :p [ lang:en "cat" ]

which allows you also to write things like

:x :p  [ lang:en "cat" ] , [ lang:fr "chat" ].

or if you use the  ^  back-path syntax of N3 (which was not taken up in Turtle),

:x :p "cat"^lang:en,  "chat"^lang:fr .

You can do the same with datatypes:

:x :q   "2013-11-25"^xsd:date .

instead of

:x :q   "2013-11-25"^^xsd:date .


I suggested these properties way back as a way of putting the info into the 
graph, but my suggestion was not adopted.  I think it would have made the model
more complete, which would have been a good thing, though SPARQL would need to 
have language-independent query matching as a special case -- but it does now 
too, really.
  
(These are interpretation properties.  I must really update 
http://www.w3.org/DesignIssues/InterpretationProperties.html)

Units are fun as properties too. http://www.w3.org/2007/ont/unit

Tim

 
   Andy
 




Re: Lang and dt in the graph. Was: Dumb SPARQL query problem

2013-12-01 Thread Andy Seaborne



On 01/12/13 12:25, Tim Berners-Lee wrote:


But I have always preferred it with not 2 extra triples, just one:

:x  :p [ lang:en "cat" ]

which allows you also to write things like

:x :p  [ lang:en "cat" ] , [ lang:fr "chat" ].

or if you use the  ^  back-path syntax of N3 (which was not taken up in Turtle),

:x :p "cat"^lang:en,  "chat"^lang:fr .

You can do the same with datatypes:

:x :q   "2013-11-25"^xsd:date .

instead of

:x :q   "2013-11-25"^^xsd:date .


This seems to bring its own issues.  These bnodes seem to be like 
untidy literals as considered in the RDF-2004 WG.


:x  :p [ lang:en "cat" ]
:x  :p [ lang:en "cat" ]
:x  :p [ lang:en "cat" ]

is 6 triples.

:x :p :q .
:x :p :q .
:x :p :q .

is 1 triple.  Repeated read in same file - this already causes confusion.

:x :p "cat" .
:x :p "cat" .
:x :p "cat" .

is 1 triple or is it 3 triples because it's really

:x :p [ xsd:string "cat" ].

:x :p 123 .
:x :p 123 .
:x :p 123 .

It makes it hard to ask "do X and Y have the same value for :p?" - it 
gets messy to consider all the cases of triple patterns that arise and I 
would not want to push that burden back onto the application writer. 
Why can't the app writer say "find me all things with a property value 
less than 45"?
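
The 6-versus-1 count above can be reproduced with a small Python sketch. It
assumes a graph is a plain set of (subject, predicate, object) tuples and
that every [ ... ] in the input mints a fresh blank node; load_bnode_form and
load_plain are illustrative names, not functions of any RDF toolkit.

```python
import itertools

_fresh = itertools.count()  # source of fresh blank node labels

def load_bnode_form(graph, s, p, lang_prop, text):
    """Load one occurrence of ':s :p [ lang_prop "text" ]'.

    Each [ ... ] introduces a fresh blank node, so every load adds
    two new triples even when the input text is identical.
    """
    b = f"_:b{next(_fresh)}"
    graph.add((s, p, b))
    graph.add((b, lang_prop, text))

def load_plain(graph, s, p, o):
    """Load one occurrence of ':s :p o'. A graph is a set, so repeats collapse."""
    graph.add((s, p, o))

bnode_graph, plain_graph = set(), set()
for _ in range(3):
    load_bnode_form(bnode_graph, ":x", ":p", "lang:en", "cat")
    load_plain(plain_graph, ":x", ":p", ":q")

print(len(bnode_graph))  # 6
print(len(plain_graph))  # 1
```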


To give that, if we add interpretation of bNodes used in this value form 
(datatype properties vs object properties ?), so you can ask about 
shared values, we have made them tidy again.  But then it is little 
different from structured literals with @lang and ^^datatype.


Having the data model and the access model different does not gain 
anything.  The data model should reflect the way the data is accessed.


Like RDF lists, or seq/alt/bag, encoding values in triples is attractive 
in its uniformity but the triples nature always shows through 
somewhere, making something else complicated.


Andy

PS Graph leaning does not help because you can't add data incrementally 
if leaning is applied at each addition.











Re: Lang and dt in the graph. Was: Dumb SPARQL query problem

2013-12-01 Thread Hugh Glaser
Hi.
Thanks.
A bit of help please :-)
On 1 Dec 2013, at 17:36, Andy Seaborne andy.seabo...@epimorphics.com wrote:

 
 
 
 This seems to bring its own issues.  These bnodes seem to be like untidy 
 literals as considered in the RDF-2004 WG.
 
 :x  :p [ lang:en "cat" ]
 :x  :p [ lang:en "cat" ]
 :x  :p [ lang:en "cat" ]
 
 is 6 triples.
 
 :x :p :q .
 :x :p :q .
 :x :p :q .
 
 is 1 triple.  Repeated read in same file - this already causes confusion.
 
 :x :p "cat" .
 :x :p "cat" .
 :x :p "cat" .
 
 is 1 triple or is it 3 triples because it's really
Is it not 1 triple if you take the first view or 6 triples if you take the 
second?
Or probably I don’t understand bnodes properly!?
 
 :x :p [ xsd:string "cat" ].
 
 :x :p 123 .
 :x :p 123 .
 :x :p 123 .
 
 It makes it hard to ask "do X and Y have the same value for :p?" - it gets 
 messy to consider all the cases of triple patterns that arise and I would not 
 want to push that burden back onto the application writer. Why can't the app 
 writer say "find me all things with a property value less than 45"?
I see it makes it hard, but I don’t see it as any harder than what we have now, 
with multiple patterns that do and don’t have ^^xsd:string.
As I said before, with the ^^xsd you need to consider a bunch of patterns to do 
the query - again, it is messy, but is it messier?

Actually I find
 { ?s1 ?p [ xsd:string ?str ] . ?s2 ?p [ xsd:string ?str ] . }
possibly also with
 { ?s1 ?p ?str . ?s2 ?p ?str . }
much easier to work with than something that has this stuff optionally tacked 
on the end of literals, that isn’t really part of the string but isn’t part of 
RDF either.
Or maybe it is part of the literal but not the string? Surely that should be 
clear to me?

I just don’t see there is a difference in complexity for querying - it is just 
that the current situation is genuinely messier for consumers because there are 
two notations in play, whereas if RDF is so good we should have everything in 
RDF.
Not that I would say anything should change :-) it ain’t actually broken, but 
it could get fixed.

(Oh dear, Hugh showing his ignorance of the fancy stuff again)

Best
Hugh
 





Re: Lang and dt in the graph. Was: Dumb SPARQL query problem

2013-12-01 Thread Ross Horne
Andy is right (as usual!). With the proposed bnode encoding, the graph
becomes fatter each time the same triple is loaded.

RDF 1.1 has just fixed the mess caused by blurring the roles of the lexer
and the parser, as summarised by David recently:
http://lists.w3.org/Archives/Public/public-lod/2013Nov/0093.html

Please don't get back into mixing up the lexer and the parser. The lexical
spaces of the basic datatypes are disjoint, so in any language we can just
write:
 - 999  instead of "999"^^xsd:integer
 - 9.99 instead of "9.99"^^xsd:decimal
 - "WWV" instead of "WWV"^^xsd:string
 - 2013-06-6T11:00:00+01:00 instead of
"2013-06-6T11:00:00+01:00"^^xsd:dateTime

As part of a compiler [1], a lexer gobbles up characters, e.g. 999, and
turns the characters into a token. A token consists of a string, called an
attribute value, plus a token name, e.g. "999"^^xsd:integer. Only a
relatively small handful of people writing compilers for languages should
have to care about how tokens are represented, not end users of languages.

For language tags, a little simple conventional datatype subtyping (as
opposed to rdfs:subClassOf), could help the programmer further [2]. e.g. a
programmer that writes regex("WWV2013"@en, "WWV") clearly meant
regex("WWV2013", "WWV") and shouldn't have to care about the distinction,
unless I am mistaken.

Regards,

Ross

[1] Ullman, Aho, Lam and Sethi. Compilers: principles, techniques and
tools. 1986
[2] Local Type Checking for Linked Data Consumers.
http://dx.doi.org/10.4204/EPTCS.123.4


Re: Dumb SPARQL query problem

2013-11-24 Thread Gregg Kellogg
Relevant to RDF 1.1 support in SPARQL 1.1, I modified my local copy of the 
SPARQL 1.1 Query test suite for the result differences I noticed:

distinct-all.srx - Remove 3 literal results with 
datatype="http://www.w3.org/2001/XMLSchema#string"
distinct-str.srx - (same)
strdt03.srx - Remove datatype="http://www.w3.org/2001/XMLSchema#string" from 
literals
strlang03.srx - (same)

After that, I can continue to pass SPARQL 1.1 Query tests for my Ruby 
implementation.

Gregg Kellogg
gr...@greggkellogg.net

On Nov 23, 2013, at 11:25 AM, David Booth da...@dbooth.org wrote:

 On 11/23/2013 12:21 PM, Andy Seaborne wrote:
 
 
 On 23/11/13 17:01, David Booth wrote:
 Hi Hugh,
 
 A little correction and a further question . . .
 
 On 11/23/2013 10:17 AM, Hugh Glaser wrote:
 Pleasure.
 Actually, I found this:
 http://answers.semanticweb.com/questions/3530/sparql-query-filtering-by-string
 
 
 
 I said it is a pig’s breakfast because you never know what the RDF
 publisher has decided to do, and need to try everything.
 So to match strings efficiently you need to do (at least) four queries:
 “cat”
 “cat”@en
 “cat”^^xsd:string
 
 Is that still true in SPARQL 1.1?  In Turtle "cat" means the exact same
 thing as "cat"^^xsd:string:
 http://www.w3.org/TR/turtle/#literals
 
 But this section of SPARQL 1.1 Section 4.1.2 Syntax for Literals has
 no mention of them being the same:
 http://www.w3.org/TR/sparql11-query/#QSynLiterals
 
 Anyone (Andy?) know whether this was fixed in SPARQL 1.1?  I thought
 SPARQL 1.1 and Turtle had been pretty well aligned.
 
 SPARQL 1.1 says nothing about it aside from (as in SPARQL 1.0)
 DATATYPE("abc") is xsd:string and DATATYPE("abc"@en) is rdf:langString
 (in 1.1).
 
 What it should say, but does not because SPARQL 1.1 finished before RDF
 1.1 got near sufficiently stable, is
 
 1/ parsing "abc" and "abc"^^xsd:string is the same thing.
 2/ In results formats, it's "abc" or equivalent, and no ^^xsd:string.
 
 For matching, it falls out in the matching over RDF but actually putting
 that in the text would be nice.
 
 Ah yes, I see that in the RDF 1.1 draft now:
 http://www.w3.org/TR/rdf11-concepts/#h3_section-Graph-Literal
 [[
 Concrete syntaxes MAY support simple literals, consisting of only a lexical 
 form without any datatype IRI or language tag. Simple literals only exist in 
 concrete syntaxes, and are treated as syntactic sugar for abstract syntax 
 literals with the datatype IRI http://www.w3.org/2001/XMLSchema#string.
 ]]
 
 So in effect, this was fixed at the RDF 1.1 abstract level, so even though 
 the SPARQL 1.1 spec did not mention it, if a SPARQL 1.1 server is RDF 1.1 
 compliant, then it will treat "abc" and "abc"^^xsd:string as the same.
 
 Thanks!
 David
 




Dumb SPARQL query problem

2013-11-23 Thread Richard Light

Hi,

Sorry to bother the list, but I'm stumped by what should be a simple 
SPARQL query.  When applied to the dbpedia end-point [1], this search:


PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
    ?pers a foaf:Person .
    ?pers foaf:surname "Malik" .
    OPTIONAL {?pers dbpedia-owl:birthDate ?dob }
    OPTIONAL {?pers dbpedia-owl:deathDate ?dod }
    OPTIONAL {?pers dbpedia-owl:placeOfBirth ?pob }
    OPTIONAL {?pers dbpedia-owl:placeOfDeath ?pod }
}
LIMIT 100

yields no results. Yet if you drop the '?pers foaf:surname "Malik" .' 
clause, you get a result set which includes a Malik with the desired 
surname property.  I'm clearly being dumb, but in what way? :-)


(I've tried adding ^^xsd:string to the literal, but no joy.)

Thanks,

Richard
[1] http://dbpedia.org/sparql
--
*Richard Light*


Re: Dumb SPARQL query problem

2013-11-23 Thread Hugh Glaser
It’s the other bit of the pig’s breakfast.
Try an @en






Re: Dumb SPARQL query problem

2013-11-23 Thread Richard Light


On 23/11/2013 10:30, Hugh Glaser wrote:

It’s the other bit of the pig’s breakfast.
Try an @en

Magic!  Thanks.

Richard




Re: Dumb SPARQL query problem

2013-11-23 Thread Hugh Glaser
Pleasure.
Actually, I found this:
http://answers.semanticweb.com/questions/3530/sparql-query-filtering-by-string

I said it is a pig’s breakfast because you never know what the RDF publisher 
has decided to do, and need to try everything.
So to match strings efficiently you need to do (at least) four queries:
“cat”
“cat”@en
“cat”^^xsd:string
“cat”@en^^xsd:string or “cat”^^xsd:string@en - I can’t remember which is right, 
but I think it’s only one of them :-)

Of course if you are matching in SPARQL you can use “… ?o . FILTER (str(?o) = 
“cat”)…”, but that is likely to be much slower.
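
The difference between the two approaches can be sketched in Python. This is
a toy model, assuming each literal is a (lexical form, datatype, language
tag) triple and that plain and xsd:string literals are kept as distinct
terms, as described above; match_term and match_str are invented names and
the data is made up.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# Each literal is (lexical form, datatype IRI or None, language tag or None).
data = [
    (":a", ":label", ("cat", None, None)),        # "cat"
    (":b", ":label", ("cat", None, "en")),        # "cat"@en
    (":c", ":label", ("cat", XSD_STRING, None)),  # "cat"^^xsd:string
    (":d", ":label", ("dog", None, "en")),        # "dog"@en
]

def match_term(triples, literal):
    """Exact term matching, like a triple pattern with a constant object."""
    return [s for s, _, o in triples if o == literal]

def match_str(triples, lexical):
    """Like FILTER (str(?o) = lexical): compare lexical forms only.

    Every triple has to be visited, which is why it tends to be slower
    than an indexed lookup on the exact term.
    """
    return [s for s, _, o in triples if o[0] == lexical]

print(match_term(data, ("cat", None, None)))  # [':a']
print(match_str(data, "cat"))                 # [':a', ':b', ':c']
```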

This means that you may need to do a lot of queries.
I built something to look for matching strings (of course! - finding sameAs 
candidates) where the RDF had been gathered from different sources.
Something like
SELECT ?a ?b WHERE { ?a ?p1 ?s . ?b ?p2 ?s }
would have been nice.
I’ll leave it as an exercise to the reader to work out how many queries it 
takes to genuinely achieve the desired effect without using FILTER and str.

Unfortunately it seems that recent developments have not been much help here, 
but I may be wrong:
http://www.w3.org/TR/sparql11-query/#matchingRDFLiterals

I guess that the truth is that other people don’t actually build systems that 
follow your nose to arbitrary Linked Data resources, so they don’t worry about 
it?
Or am I missing something obvious, and people actually have a good way around 
this?

To me the problem all comes because knowledge is being represented outside the 
triple model.
And also because of the XML legacy of RDF, even though everyone keeps saying 
that is only a serialisation of an abstract model.
Ah well, back in my box.

Cheers.






Re: Dumb SPARQL query problem

2013-11-23 Thread Gannon Dick
Not sure if this helps multilingual pigs as much as it should, but I'm not much 
good before coffee and expect there are many fellow mammals who share my plight 
...

Language classification code reduction (in old fashioned SQL)
http://www.rustprivacy.org/faca/languages.php







Re: Dumb SPARQL query problem

2013-11-23 Thread David Booth

Hi Hugh,

A little correction and a further question . . .

On 11/23/2013 10:17 AM, Hugh Glaser wrote:

Pleasure.
Actually, I found this:
http://answers.semanticweb.com/questions/3530/sparql-query-filtering-by-string

I said it is a pig’s breakfast because you never know what the RDF publisher 
has decided to do, and need to try everything.
So to match strings efficiently you need to do (at least) four queries:
“cat”
“cat”@en
“cat”^^xsd:string


Is that still true in SPARQL 1.1?  In Turtle "cat" means the exact same 
thing as "cat"^^xsd:string:

http://www.w3.org/TR/turtle/#literals

But this section of SPARQL 1.1 Section 4.1.2 Syntax for Literals has 
no mention of them being the same:

http://www.w3.org/TR/sparql11-query/#QSynLiterals

Anyone (Andy?) know whether this was fixed in SPARQL 1.1?  I thought 
SPARQL 1.1 and Turtle had been pretty well aligned.



“cat”@en^^xsd:string or “cat”^^xsd:string@en - I can’t remember which is right, 
but I think it’s only one of them :-)


Neither is allowed.  You can have *either* a language tag *or* a 
datatype, but not both:

http://www.w3.org/TR/sparql11-query/#QSynLiterals
http://www.w3.org/TR/sparql11-query/#rRDFLiteral

But dealing with the difference between "cat" and "cat"@en is still a 
problem, as explained here:

http://www.w3.org/TR/sparql11-query/#matchLangTags

This would have been fixed if the RDF model had been changed to 
represent the language tag as an additional triple, but whether this 
would have been a net benefit to the community is still an open 
question, as it would add the complexity of additional triples.


David



Of course if you are matching in SPARQL you can use “… ?o . FILTER (str(?o) = 
“cat”)…”, but that its likely to be much slower.

This means that you may need to do a lot of queries.
I built something to look for matching strings (of course! - finding sameAs 
candidates) where the RDF had been gathered from different sources.
Something like
SELECT ?a ?b WHERE { ?a ?p1 ?s . ?b ?p2 ?s }
would have been nice.
I’ll leave it as an exercise to the reader to work out how many queries it 
takes to genuinely achieve the desired effect without using FILTER and str.

Unfortunately it seems that recent developments have not been much help here, 
but I may be wrong:
http://www.w3.org/TR/sparql11-query/#matchingRDFLiterals

I guess that the truth is that other people don’t actually build systems that 
follow your nose to arbitrary Linked Data resources, so they don’t worry about 
it?
Or am I missing something obvious, and people actually have a good way around 
this?

To me the problem all comes because knowledge is being represented outside the 
triple model.
And also because of the XML legacy of RDF, even though everyone keeps saying 
that is only a serialisation of an abstract model.
Ah well, back in my box.

Cheers.

On 23 Nov 2013, at 11:00, Richard Light rich...@light.demon.co.uk wrote:



On 23/11/2013 10:30, Hugh Glaser wrote:

Its’ the other bit of the pig’s breakfast.
Try an @en


Magic!  Thanks.

Richard

On 23 Nov 2013, at 10:18, Richard Light rich...@light.demon.co.uk
  wrote:



Hi,

Sorry to bother the list, but I'm stumped by what should be a simple SPARQL 
query.  When applied to the dbpedia end-point [1], this search:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

SELECT *
WHERE {
 ?pers a foaf:Person .
 ?pers foaf:surname "Malik" .
 OPTIONAL {?pers dbpedia-owl:birthDate ?dob }
 OPTIONAL {?pers dbpedia-owl:deathDate ?dod }
 OPTIONAL {?pers dbpedia-owl:placeOfBirth ?pob }
 OPTIONAL {?pers dbpedia-owl:placeOfDeath ?pod }
}
LIMIT 100

yields no results. Yet if you drop the '?pers foaf:surname "Malik" .' clause, 
you get a result set which includes a Malik with the desired surname property.  I'm 
clearly being dumb, but in what way? :-)

(I've tried adding ^^xsd:string to the literal, but no joy.)
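
For reference, a sketch of the variant that the replies in this thread report as working: the same query with a language tag on the literal. Only the "@en" is changed; everything else is taken from the query above.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

SELECT *
WHERE {
  ?pers a foaf:Person .
  ?pers foaf:surname "Malik"@en .   # language-tagged literal
  OPTIONAL { ?pers dbpedia-owl:birthDate ?dob }
  OPTIONAL { ?pers dbpedia-owl:deathDate ?dod }
  OPTIONAL { ?pers dbpedia-owl:placeOfBirth ?pob }
  OPTIONAL { ?pers dbpedia-owl:placeOfDeath ?pod }
}
LIMIT 100
```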

Thanks,

Richard
[1]
http://dbpedia.org/sparql

--
Richard Light



--
Richard Light






Re: Dumb SPARQL query problem

2013-11-23 Thread Andy Seaborne



On 23/11/13 17:01, David Booth wrote:

Hi Hugh,

A little correction and a further question . . .

On 11/23/2013 10:17 AM, Hugh Glaser wrote:

Pleasure.
Actually, I found this:
http://answers.semanticweb.com/questions/3530/sparql-query-filtering-by-string


I said it is a pig’s breakfast because you never know what the RDF
publisher has decided to do, and need to try everything.
So to match strings efficiently you need to do (at least) four queries:
“cat”
“cat”@en
“cat”^^xsd:string


Is that still true in SPARQL 1.1?  In Turtle "cat" means the exact same
thing as "cat"^^xsd:string:
http://www.w3.org/TR/turtle/#literals

But Section 4.1.2 of SPARQL 1.1, Syntax for Literals, has
no mention of them being the same:
http://www.w3.org/TR/sparql11-query/#QSynLiterals

Anyone (Andy?) know whether this was fixed in SPARQL 1.1?  I thought
SPARQL 1.1 and Turtle had been pretty well aligned.


SPARQL 1.1 says nothing about it aside from (as in SPARQL 1.0) 
DATATYPE("abc") is xsd:string and DATATYPE("abc"@en) is rdf:langString 
(in 1.1).


What it should say, but does not because SPARQL 1.1 finished before RDF 
1.1 got anywhere near sufficiently stable, is


1/ parsing "abc" and "abc"^^xsd:string is the same thing.
2/ In results formats, it's "abc" or equivalent, and no ^^xsd:string.

For matching, it falls out in the matching over RDF but actually putting 
that in the text would be nice.




“cat”@en^^xsd:string or “cat”^^xsd:string@en - I can’t remember which
is right, but I think it’s only one of them :-)


Neither is allowed.  You can have *either* a language tag *or* a
datatype, but not both:
http://www.w3.org/TR/sparql11-query/#QSynLiterals
http://www.w3.org/TR/sparql11-query/#rRDFLiteral


Ditto in RDF syntax.



But dealing with the difference between "cat" and "cat"@en is still a
problem, as explained here:
http://www.w3.org/TR/sparql11-query/#matchLangTags
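
One hedged way to avoid a separate query per spelling is to list the candidate terms with the SPARQL 1.1 VALUES clause. This is only a sketch: the ex: prefix and ex:label property are hypothetical, and each language tag of interest still has to be enumerated by hand.

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace

# One query, several exact-term matches: the plain literal and the
# English-tagged literal are different RDF terms, so both are listed.
SELECT ?s ?o
WHERE {
  VALUES ?o { "cat" "cat"@en }
  ?s ex:label ?o .
}
```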

This would have been fixed if the RDF model had been changed to
represent the language tag as an additional triple, but whether this
would have been a net benefit to the community is still an open
question, as it would add the complexity of additional triples.


Different.  Maybe better, maybe worse.


Do you want all your "abc" to be the same language?

   "abc" rdf:lang "en" .

or multiple languages:

   "abc" rdf:lang "cy" .
   "abc" rdf:lang "en" .


?

Unlikely - so it's bnode time ...

:x :p [ rdf:value "abc" ; rdf:lang "en" ] .
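
To make the cost of that encoding concrete, here is a sketch of how a query against it might look. Note that rdf:lang is not a term defined by the RDF vocabulary; it is used hypothetically here, as in the example above, and the binding of the empty prefix to example.org is also an assumption.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX :    <http://example.org/>   # hypothetical namespace

# Every string match now goes through an intermediate blank node:
# one triple for the value and one for the (hypothetical) language.
SELECT ?x ?lang
WHERE {
  ?x :p [ rdf:value "abc" ; rdf:lang ?lang ] .
}
```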

Andy




David











Re: Dumb SPARQL query problem

2013-11-23 Thread David Booth

On 11/23/2013 12:21 PM, Andy Seaborne wrote:



On 23/11/13 17:01, David Booth wrote:

Hi Hugh,

A little correction and a further question . . .

On 11/23/2013 10:17 AM, Hugh Glaser wrote:

Pleasure.
Actually, I found this:
http://answers.semanticweb.com/questions/3530/sparql-query-filtering-by-string



I said it is a pig’s breakfast because you never know what the RDF
publisher has decided to do, and need to try everything.
So to match strings efficiently you need to do (at least) four queries:
“cat”
“cat”@en
“cat”^^xsd:string


Is that still true in SPARQL 1.1?  In Turtle "cat" means the exact same
thing as "cat"^^xsd:string:
http://www.w3.org/TR/turtle/#literals

But Section 4.1.2 of SPARQL 1.1, Syntax for Literals, has
no mention of them being the same:
http://www.w3.org/TR/sparql11-query/#QSynLiterals

Anyone (Andy?) know whether this was fixed in SPARQL 1.1?  I thought
SPARQL 1.1 and Turtle had been pretty well aligned.


SPARQL 1.1 says nothing about it aside from (as in SPARQL 1.0)
DATATYPE("abc") is xsd:string and DATATYPE("abc"@en) is rdf:langString
(in 1.1).

What it should say, but does not because SPARQL 1.1 finished before RDF
1.1 got anywhere near sufficiently stable, is

1/ parsing "abc" and "abc"^^xsd:string is the same thing.
2/ In results formats, it's "abc" or equivalent, and no ^^xsd:string.

For matching, it falls out in the matching over RDF but actually putting
that in the text would be nice.


Ah yes, I see that in the RDF 1.1 draft now:
http://www.w3.org/TR/rdf11-concepts/#h3_section-Graph-Literal
[[
Concrete syntaxes MAY support simple literals, consisting of only a 
lexical form without any datatype IRI or language tag. Simple literals 
only exist in concrete syntaxes, and are treated as syntactic sugar for 
abstract syntax literals with the datatype IRI 
http://www.w3.org/2001/XMLSchema#string.

]]

So in effect, this was fixed at the RDF 1.1 abstract level, so even 
though the SPARQL 1.1 spec did not mention it, if a SPARQL 1.1 server is 
RDF 1.1 compliant, then it will treat "abc" and "abc"^^xsd:string as the 
same.
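
A small sketch of how one might probe a given endpoint for this behaviour. sameTerm() compares RDF terms rather than values, so the query should return true where the plain and the xsd:string forms are parsed to the same term, and false where they are kept distinct.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Expected to be true on a store following the RDF 1.1 treatment of
# simple literals, false on one that keeps the two terms distinct.
ASK {
  FILTER (sameTerm("abc", "abc"^^xsd:string))
}
```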


Thanks!
David