Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread Jerven Bolleman

Hi Barry,

The workbench is not involved; it's just using the Sesame API directly.

e.g.
((GraphQuery) query).evaluate();

The triple is explicit in the dataset, which means setIncludeInferred
true or false should not matter. Testing with the debugger confirms the
result is unaffected.
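The same check can also be made over HTTP rather than through the Java API. The sketch below is a minimal illustration, not the code used at UniProt: it assumes a local Sesame server at the URL shown and relies on the `infer` form parameter of the Sesame HTTP protocol as the counterpart of setIncludeInferred. The helper name `build_query_request` is hypothetical. It only builds the two requests; sending them is left to the caller.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_query_request(repository_url, query, include_inferred):
    """Build (but do not send) a form-encoded POST for a Sesame repository."""
    form = urlencode({
        "query": query,
        # Assumed counterpart of setIncludeInferred in the Sesame HTTP protocol.
        "infer": "true" if include_inferred else "false",
    }).encode("ascii")
    return Request(
        repository_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Assumed local endpoint, matching the repository name used in this thread.
REPOSITORY = "http://localhost:8080/openrdf-sesame/repositories/uniprot"
QUERY = "DESCRIBE <http://purl.uniprot.org/intact/EBI-530932>"

with_inference = build_query_request(REPOSITORY, QUERY, True)
without_inference = build_query_request(REPOSITORY, QUERY, False)

# Show the two request bodies; they differ only in the infer flag.
print(with_inference.data.decode("ascii"))
print(without_inference.data.decode("ascii"))
```

If the triple is explicit, both requests should return it once passed to `urllib.request.urlopen`; a difference between the two responses would point at inference settings rather than at the stored data.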

To be clear, we load the data using ruleset=none, which might affect the
scenario.

The query used to work; it is only in the latest version that I notice
the triple is gone.


Regards,
Jerven

On 12/19/2012 11:03 AM, Barry Norton wrote:


Jerven, using owlim-se-5.3.5689 I just executed:

$ wget http://www.uniprot.org/uniprot/P68353.nt
$ curl -X POST -H "Content-Type: text/turtle" -T P68353.nt \
    http://localhost:8080/openrdf-sesame/repositories/uniprot/statements
$ curl -d "query=DESCRIBE <http://purl.uniprot.org/intact/EBI-530932>" \
    http://localhost:8080/openrdf-sesame/repositories/uniprot

I got:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sesame: <http://www.openrdf.org/schema/sesame#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix fn: <http://www.w3.org/2005/xpath-functions#> .

{
<http://purl.uniprot.org/intact/EBI-530932> a
<http://purl.uniprot.org/core/Participant> ;
 owl:sameAs <http://purl.uniprot.org/uniprot/P68359> ,
<http://purl.uniprot.org/intact/EBI-530932> ;
 rdfs:label "CSN4" .

*<http://purl.uniprot.org/uniprot/P68359> owl:sameAs
<http://purl.uniprot.org/intact/EBI-530932> .*

*<http://purl.uniprot.org/intact/EBI-530932> owl:sameAs
<http://purl.uniprot.org/intact/EBI-530932> .*

 <foo:bar#_5036383335330019>
<http://purl.uniprot.org/core/participant>
<http://purl.uniprot.org/intact/EBI-530932> .
}

Can you provide any more details that might help to reproduce your problem?

Could it simply be your parameter _equivalent=false (I believe there's a
checkbox in the UI), as the database clearly has triples expressing this
equivalence?

Barry



On 19/12/12 09:52, Jerven Bolleman wrote:

Dear ontotext developers,

I am missing a lot of owl:sameAs statements in our data.
I can reproduce the problem both at linkedlifedata.com and at
beta.sparql.uniprot.org.

The version is OwlimSchemaRepository: version: 5.3, revision: 5689

An example is
http://linkedlifedata.com/sparql?query=DESCRIBE+%3Chttp%3A%2F%2Fpurl.uniprot.org%2Fintact%2FEBI-530927%3E&_implicit=false&implicit=true&_equivalent=false&_form=%2Fsparql
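Decoding this URL shows which query and which switches were actually sent, which is relevant to the question about _equivalent=false. The sketch below is illustrative only: it assumes the parameters in the archived URL were originally separated by "&" characters (they appear to have been lost in archiving), and the helper name `decode_parameters` is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# The request URL from the report, with "&" separators restored (an assumption).
URL = (
    "http://linkedlifedata.com/sparql"
    "?query=DESCRIBE+%3Chttp%3A%2F%2Fpurl.uniprot.org%2Fintact%2FEBI-530927%3E"
    "&_implicit=false&implicit=true&_equivalent=false&_form=%2Fsparql"
)


def decode_parameters(url):
    """Return the query-string parameters of a URL as a {name: value} dict."""
    pairs = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Keep the last value if a parameter name is repeated.
    return {name: values[-1] for name, values in pairs.items()}


params = decode_parameters(URL)
for name, value in params.items():
    print(f"{name} = {value}")
```

Under that assumption the decoded query is a DESCRIBE of EBI-530927, whereas the triple reported as missing has EBI-530932 as its subject, so the two identifiers are worth double-checking when reproducing the report.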


The following triple is missing
<http://purl.uniprot.org/intact/EBI-530932>
<http://www.w3.org/2002/07/owl#sameAs>
<http://purl.uniprot.org/uniprot/P68359> .

It can be found in the source data, i.e.
http://www.uniprot.org/uniprot/P68353.nt
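Whether the triple really is in the source file can be checked without any triple store. The following is a rough sketch under stated assumptions: it handles only N-Triples statements whose three terms are all IRIs (enough for owl:sameAs links), the names `iri_triples` and `contains_triple` are hypothetical, and the sample lines are a stand-in rather than an extract of the real file.

```python
import re

# Matches an N-Triples statement whose subject, predicate and object are IRIs.
IRI_TRIPLE = re.compile(r"^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def iri_triples(lines):
    """Yield (subject, predicate, object) for IRI-only N-Triples statements."""
    for line in lines:
        match = IRI_TRIPLE.match(line)
        if match:
            yield match.groups()


def contains_triple(lines, triple):
    """Return True if the given IRI-only triple occurs in the lines."""
    return any(candidate == triple for candidate in iri_triples(lines))


EXPECTED = (
    "http://purl.uniprot.org/intact/EBI-530932",
    "http://www.w3.org/2002/07/owl#sameAs",
    "http://purl.uniprot.org/uniprot/P68359",
)

# Stand-in sample data (hypothetical, not copied from the real P68353.nt).
SAMPLE = [
    '<http://purl.uniprot.org/intact/EBI-530932> '
    '<http://www.w3.org/2000/01/rdf-schema#label> "CSN4" .',
    "<http://purl.uniprot.org/intact/EBI-530932> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://purl.uniprot.org/uniprot/P68359> .",
]

found = contains_triple(SAMPLE, EXPECTED)
print(found)
```

Against the downloaded file, the same function can be called with an open file handle in place of `SAMPLE`, for example `contains_triple(open("P68353.nt"), EXPECTED)`.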

This is messing up quite a few queries on our end. Could you please look
into this as soon as possible? It also affects SELECT and ASK
queries, so it's not just DESCRIBE.

Regards,
Jerven





___
Owlim-discussion mailing list
Owlim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/owlim-discussion




--
---
 Jerven Bolleman    jerven.bolle...@isb-sib.ch
 SIB Swiss Institute of Bioinformatics  Tel: +41 (0)22 379 58 85
 CMU, rue Michel Servet 1   Fax: +41 (0)22 379 58 58
 1211 Geneve 4,
 Switzerland www.isb-sib.ch - www.uniprot.org
 Follow us at https://twitter.com/#!/uniprot
---


Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread Barry Norton


Thanks for the clarification. I believe Vassil has answered the 
question in the meantime: this is down to the way the data is loaded 
and exposed in LinkedLifeData.


OWLIM does index the equivalence in the raw dataset and, under 
appropriate conditions, will infer the symmetric dual.


Barry




Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread Jerven Bolleman

Hi Barry,

No, there are two systems that show the same problem.
All triples with owl:sameAs as predicate are gone.

i.e.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT * WHERE {?s owl:sameAs ?o}
returns no results, both at linkedlifedata and at
beta.sparql.uniprot.org.

Apparently Vassil claims that linkedlifedata deletes all owl:sameAs 
statements and therefore should not return results. This is silly, but 
OK: if you really do that, then you cannot replicate my problem at that 
endpoint.


However, we at uniprot do not remove any owl:sameAs statement. We load 
these into the store and then do not get them back in query answering.


Basically all triples with owl:sameAs are silently deleted by the store.

A dataset that you can use for experimenting, and which is moderate in size, is 
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz

Regards,
Jerven



Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread Jerven Bolleman

On 12/19/2012 12:01 PM, Vassil Momtchev wrote:

Jerven,

May I rephrase your issue and double check if I correctly understood it.

After upgrading to OWLIM version 5.3 revision 5689 you observed missing
sameAs triples. The system to reproduce this behavior is
beta.sparql.uniprot.org.

Yes.


I see a difference in the posted resources P68359 vs P68353 (is it
sameAs the link between the two resources?). Could you clarify the
following points:

  * What's the missing statement?

Let's use a simpler example:
http://beta.sparql.uniprot.org/sparql?query=DESCRIBE+%3Chttp%3A%2F%2Fpurl.uniprot.org%2Fcitations%2F19660865%3E

i.e.
DESCRIBE <http://purl.uniprot.org/citations/19660865>
with source data
http://www.uniprot.org/citations/19660865.nt

The result is missing
<http://purl.uniprot.org/citations/19660865> 
<http://www.w3.org/2002/07/owl#sameAs> 
<http://www.ncbi.nlm.nih.gov/pubmed/19660865>

once loaded into OWLIM.


  * Is it only a single triple or all statements with the owl:sameAs predicate?

all triples with the owl:sameAs predicate

  * What's the original resource loaded to OWLIM?

ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz
Which contains the owl:sameAs links
This is enough to demonstrate the problem.

  * Did you use the same revision to load the data, or did you simply update
the OWLIM database?

Same revision. Started totally from scratch.

Regards,
Jerven


Thanks,
V.





--

--
Vassil Momtchev
Head of Life Science R&D of Ontotext AD
http://www.ontotext.com
Skype: vassil_momtchev
--





Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread Barry Norton


Jerven, confirmed I can reproduce your problem thus:

$ wget ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz
$ gunzip citations.rdf.gz
$ curl -X POST -H "Content-Type: application/rdf+xml" -T citations.rdf \
    http://localhost:8080/openrdf-sesame/repositories/citations/statements


And now (unlike with P68353.nt) get only:
$ curl -d "query=DESCRIBE <http://purl.uniprot.org/uniprot/P68359>" \
    http://localhost:8080/openrdf-sesame/repositories/citations

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sesame: <http://www.openrdf.org/schema/sesame#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix fn: <http://www.w3.org/2005/xpath-functions#> .

[no results]

And:

$ curl -H "Accept: application/sparql-results+json" \
    -d "query=PREFIX owl: <http://www.w3.org/2002/07/owl#> SELECT * WHERE {?s owl:sameAs ?o} LIMIT 10" \
    http://localhost:8080/openrdf-sesame/repositories/citations

{
  "head": {
    "vars": [ "s", "o" ]
  },
  "results": {
    "bindings": [ ]
  }
}

[again, nothing]
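An empty result like this one can be detected programmatically when re-running the check after a fix. The sketch below assumes the response follows the standard SPARQL JSON results layout (`results.bindings`, each cell carrying a `value`); the function name `binding_rows` and the non-empty sample response are hypothetical, with the sample built from the citation triple mentioned earlier in this thread.

```python
import json


def binding_rows(results_json):
    """Return the rows of a SPARQL JSON result as {variable: value} dicts."""
    document = json.loads(results_json)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in document["results"]["bindings"]
    ]


# The empty response reported above, written as valid JSON.
EMPTY = '{"head": {"vars": ["s", "o"]}, "results": {"bindings": []}}'

# Hypothetical response showing what a working store might return.
NON_EMPTY = json.dumps({
    "head": {"vars": ["s", "o"]},
    "results": {"bindings": [{
        "s": {"type": "uri",
              "value": "http://purl.uniprot.org/citations/19660865"},
        "o": {"type": "uri",
              "value": "http://www.ncbi.nlm.nih.gov/pubmed/19660865"},
    }]},
})

print(len(binding_rows(EMPTY)))      # number of rows in the empty response
print(binding_rows(NON_EMPTY))       # rows of the hypothetical response
```

A regression test for this problem could then simply assert that `binding_rows(...)` is non-empty for the owl:sameAs query.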

Barry




Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread Barry Norton


Further, I converted the data to N-Triples (note that Jena gives a lot of 
warnings; it's probably worth reproducing this and taking note) and pulled 
out only the sameAs statements:


$ apache-jena-2.7.4/bin/rdfparse citations.rdf > citations.rdf.nt
$ wc citations.rdf.nt
  12693123  217642066 2491240008 citations.rdf.nt
$ grep 'sameAs' citations.rdf.nt > citations-sameAs.rdf.nt
$ wc citations-sameAs.rdf.nt
   789525   3158100 103875838 citations-sameAs.rdf.nt
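The grep step matches "sameAs" anywhere on a line, including inside literals. A slightly stricter filter compares only the predicate position. The following is a sketch under the assumption that the input is line-based N-Triples, where the subject never contains whitespace; the function name `same_as_statements` and the sample lines are hypothetical.

```python
OWL_SAME_AS = "<http://www.w3.org/2002/07/owl#sameAs>"


def same_as_statements(lines):
    """Yield N-Triples lines whose predicate (second term) is owl:sameAs."""
    for line in lines:
        # Split into subject, predicate and the rest of the statement.
        terms = line.split(None, 2)
        if len(terms) == 3 and terms[1] == OWL_SAME_AS:
            yield line


# Hypothetical sample: one real sameAs link and one literal mentioning the word.
SAMPLE = [
    "<http://purl.uniprot.org/citations/19660865> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://www.ncbi.nlm.nih.gov/pubmed/19660865> .\n",
    "<http://purl.uniprot.org/citations/19660865> "
    "<http://www.w3.org/2000/01/rdf-schema#comment> "
    '"mentions sameAs in a literal" .\n',
]

selected = list(same_as_statements(SAMPLE))
print(len(selected))
```

Applied to citations.rdf.nt, the count of yielded lines should be comparable to the `wc` line count above; any difference would come from lines that mention "sameAs" outside the predicate position.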

These I upload by themselves:

$ curl -X POST -H "Content-Type: text/turtle" -T citations-sameAs.rdf.nt \
    http://localhost:8080/openrdf-sesame/repositories/citations/statements


I then received the same empty results as in my previous message.

So I think we can disregard the effect of other data and concentrate 
purely on what's happening to these sameAs statements. This requires 
looking into the internals, so I'll leave you in Damyan's capable hands.


Barry





Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread Jerven Bolleman
The Jena warnings are another issue, on the UniProt RDF production site. 
They are not for you to worry about and will in any case be fixed by the
next UniProt data release on the 9th of January.

Thanks for looking into the owl:sameAs issue.

Regards,
Jerven

Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread Barry Norton


You're welcome, just something I spotted along the way.

Sorry I've not found a solution, but I hope that reproducing and paring 
down is a step in the right direction and we can solve this quickly.


All the best,

Barry


On 19/12/2012 13:52, Jerven Bolleman wrote:
The jena warnings are another issue, on the uniprot rdf production 
site. Not for you to worry about, in any case will be fixed by the next

uniprot data release on the 9th of January.

Thanks for looking into the owl:sameAs issue.

Regards,
Jerven
On 12/19/2012 02:47 PM, Barry Norton wrote:


Further, I converted to NTriples (note, Jena gives a lot of warnings -
it's probably worth duplicating this and taking note), and pulled out
only the sameAs statements:

$ apache-jena-2.7.4/bin/rdfparse citations.rdf  citations.rdf.nt
$ wc citations.rdf.nt
   12693123  217642066 2491240008 citations.rdf.nt
$ grep 'sameAs' citations.rdf.nt  citations-sameAs.r.nt
$ wc citations-sameAs.rdf.nt
789525   3158100 103875838 citations-sameAs.rdf.nt

These I upload by themselves:

$ curl -X POST -H Content-Type:text/turtle -T citations-sameAs.rdf.nt
http://localhost:8080/openrdf-sesame/repositories/citations/statements

I received then the same results as below.

So I think we can disregard the effect of other data and concentrate
purely on what's happening to these sameAs statements. This requires
looking into the internals, so I'll leave you in Damyan's capable hands.

Barry




On 19/12/12 12:42, Barry Norton wrote:


Jerven, confirmed I can reproduce your problem thus:

$ wget ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz
$ gunzip citations.rdf.gz
$ curl -X POST -H "Content-Type:application/rdf+xml" -T citations.rdf \
    http://localhost:8080/openrdf-sesame/repositories/citations/statements

And now (unlike with P68353.nt) get only:

$ curl -d "query=DESCRIBE <http://purl.uniprot.org/uniprot/P68359>" \
    http://localhost:8080/openrdf-sesame/repositories/citations
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sesame: <http://www.openrdf.org/schema/sesame#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix fn: <http://www.w3.org/2005/xpath-functions#> .

[no results]

And:

$ curl -H "Accept:application/sparql-results+json" \
    -d "query=PREFIX owl: <http://www.w3.org/2002/07/owl#> SELECT * WHERE {?s owl:sameAs ?o} LIMIT 10" \
    http://localhost:8080/openrdf-sesame/repositories/citations

{
  "head": {
    "vars": [ "s", "o" ]
  },
  "results": {
    "bindings": [ ]
  }
}

[again, nothing]

Barry



On 19/12/12 11:50, Jerven Bolleman wrote:

Hi Barry,

No, there are two systems that show the same problem.
All triples with owl:sameAs as predicate are gone.

i.e.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT * WHERE {?s owl:sameAs ?o}
returns no results, both at linkedlifedata and at
beta.sparql.uniprot.org.

Apparently Vassil claims that linkedlifedata deletes all owl:sameAs
statements and therefore should not return results. This is silly, but
OK: if you really do that, then you cannot replicate my problem at
that endpoint.

However, we at uniprot do not remove any owl:sameAs statement. We
load these into the store and then do not get them back in query
answering.

Basically, all triples with owl:sameAs are silently deleted by the store.

A dataset that you can use for experimenting is
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz
which is moderate in size.

Regards,
Jerven


On 12/19/2012 11:38 AM, Barry Norton wrote:


Thanks for the clarification. I believe Vassil has answered the
question in the meantime: this is down to the way the data is loaded
and exposed in LinkedLifeData.

OWLIM does index the equivalence in the raw dataset and, under
appropriate conditions, will infer the symmetric dual.

Barry



On 19/12/12 10:24, Jerven Bolleman wrote:

Hi Barry,

The workbench is not involved; it's just using the Sesame API directly.

e.g.
((GraphQuery) query).evaluate();

The triple is explicit in the dataset, which means setIncludeInferred
true or false should not matter. Testing with the debugger, it is
unaffected.

To be clear, we load the data using ruleset=none, which might impact
the scenario.

The query used to work; it's just in the latest version that I notice
it being gone.

Regards,
Jerven

On 12/19/2012 11:03 AM, Barry Norton wrote:


Jerven, using owlim-se-5.3.5689 I just executed:

$ wget http://www.uniprot.org/uniprot/P68353.nt
$ curl -X POST -H "Content-Type:text/turtle" -T P68353.nt \
    http://localhost:8080/openrdf-sesame/repositories/uniprot/statements
$ curl -d "query=DESCRIBE <http://purl.uniprot.org/intact/EBI-530932>" \
    http://localhost:8080/openrdf-sesame/repositories/uniprot

I got:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sesame: <http://www.openrdf.org/schema/sesame#> .

Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples

2012-12-19 Thread damyan

Hi Jerven,

If your repository's ruleset is indeed preset to 'empty' and you do not
use any inference, then please use the -Ddisable-sameAs=true JVM option on
your service, or alter the repository configuration, until we release a
proper 5.3 release with the fix ...


regards,
damyan
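
One way to pass the JVM option mentioned above is sketched here. It assumes the Sesame web application runs under Apache Tomcat: the /openrdf-sesame path on port 8080 used in this thread suggests a servlet container, but the thread does not say which one. Tomcat's startup script reads extra JVM flags from the CATALINA_OPTS environment variable. This is not a confirmed OWLIM procedure.

```shell
# Assumption: Tomcat hosts the Sesame webapp, and catalina.sh passes
# CATALINA_OPTS on to the JVM. Append the flag once, keeping any
# options that are already set.
flag="-Ddisable-sameAs=true"
case " ${CATALINA_OPTS:-} " in
  *" $flag "*) ;;  # flag already present, leave the variable unchanged
  *) CATALINA_OPTS="${CATALINA_OPTS:+$CATALINA_OPTS }$flag" ;;
esac
export CATALINA_OPTS
echo "$CATALINA_OPTS"
```

Tomcat would then need a restart from the same environment. Alternatively, the same lines can go into bin/setenv.sh under the Tomcat base directory, which catalina.sh sources if the file exists.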


On 12/19/2012 11:03 AM, Barry Norton wrote:


Jerven, using owlim-se-5.3.5689 I just executed:

$ wget http://www.uniprot.org/uniprot/P68353.nt
$ curl -X POST -H "Content-Type:text/turtle" -T P68353.nt \
    http://localhost:8080/openrdf-sesame/repositories/uniprot/statements
$ curl -d "query=DESCRIBE <http://purl.uniprot.org/intact/EBI-530932>" \
    http://localhost:8080/openrdf-sesame/repositories/uniprot

I got:
@prefix rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# .
@prefix rdfs: