Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
Hi Barry,

The workbench is not involved; the code is just using the Sesame API directly, e.g. ((GraphQuery) query).evaluate();

The triple is explicit in the dataset, which means setIncludeInferred true or false should not matter. Testing with the debugger, the result is unaffected by that setting.

To be clear, we load the data using ruleset=none, which might impact the scenario. The query used to work; it is only in the latest version that I notice the triple being gone.

Regards,
Jerven

On 12/19/2012 11:03 AM, Barry Norton wrote:
> Jerven, using owlim-se-5.3.5689 I just executed:
>
>   $ wget http://www.uniprot.org/uniprot/P68353.nt
>   $ curl -X POST -H Content-Type:text/turtle -T P68353.nt http://localhost:8080/openrdf-sesame/repositories/uniprot/statements
>   $ curl -d 'query=DESCRIBE <http://purl.uniprot.org/intact/EBI-530932>' http://localhost:8080/openrdf-sesame/repositories/uniprot
>
> I got:
>
>   @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>   @prefix sesame: <http://www.openrdf.org/schema/sesame#> .
>   @prefix owl: <http://www.w3.org/2002/07/owl#> .
>   @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
>   @prefix fn: <http://www.w3.org/2005/xpath-functions#> .
>
>   {
>     <http://purl.uniprot.org/intact/EBI-530932> a <http://purl.uniprot.org/core/Participant> ;
>       owl:sameAs <http://purl.uniprot.org/uniprot/P68359> , <http://purl.uniprot.org/intact/EBI-530932> ;
>       rdfs:label "CSN4" .
>     <http://purl.uniprot.org/uniprot/P68359> owl:sameAs <http://purl.uniprot.org/intact/EBI-530932> .
>     <http://purl.uniprot.org/intact/EBI-530932> owl:sameAs <http://purl.uniprot.org/intact/EBI-530932> .
>     <foo:bar#_5036383335330019> <http://purl.uniprot.org/core/participant> <http://purl.uniprot.org/intact/EBI-530932> .
>   }
>
> Can you provide any more details that might help to reproduce your problem? Could it simply be your parameter _equivalent=false (I believe there's a checkbox in the UI), as the database clearly has triples expressing this equivalence?
>
> Barry
>
> On 19/12/12 09:52, Jerven Bolleman wrote:
>> Dear Ontotext developers,
>>
>> I am missing a lot of owl:sameAs statements in our data. I can reproduce the problem both at linkedlifedata.com and at beta.sparql.uniprot.org.
>>
>> The version is OwlimSchemaRepository: version: 5.3, revision: 5689
>>
>> An example is
>> http://linkedlifedata.com/sparql?query=DESCRIBE+%3Chttp%3A%2F%2Fpurl.uniprot.org%2Fintact%2FEBI-530927%3E&_implicit=false&implicit=true&_equivalent=false&_form=%2Fsparql
>>
>> The following triple is missing:
>>
>>   <http://purl.uniprot.org/intact/EBI-530932> <http://www.w3.org/2002/07/owl#sameAs> <http://purl.uniprot.org/uniprot/P68359> .
>>
>> You can find it in the source data, i.e. http://www.uniprot.org/uniprot/P68353.nt
>>
>> This is messing up quite a few queries on our end; could you please look into this as soon as possible? It also affects SELECT and ASK queries, so it is not just DESCRIBE.
>>
>> Regards,
>> Jerven

--
Jerven Bolleman                        jerven.bolle...@isb-sib.ch
SIB Swiss Institute of Bioinformatics  Tel: +41 (0)22 379 58 85
CMU, rue Michel Servet 1               Fax: +41 (0)22 379 58 58
1211 Geneve 4, Switzerland             www.isb-sib.ch - www.uniprot.org
Follow us at https://twitter.com/#!/uniprot

___
Owlim-discussion mailing list
Owlim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/owlim-discussion
Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
Thanks for the clarification - I believe Vassil has answered the question in the meantime: this is down to the way the data is loaded and exposed in LinkedLifeData.

OWLIM does index the equivalence in the raw dataset and, under appropriate conditions, will infer the symmetric dual.

Barry
Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
Hi Barry,

No, there are two systems which show the same problem. All triples with owl:sameAs as predicate are gone, i.e.

  PREFIX owl: <http://www.w3.org/2002/07/owl#>
  SELECT * WHERE { ?s owl:sameAs ?o }

returns no results, both at linkedlifedata and at beta.sparql.uniprot.org.

Apparently Vassil claims that linkedlifedata deletes all owl:sameAs statements and therefore should not return results. This is silly, but OK: if you really do that, then you cannot replicate my problem at that endpoint.

However, we at UniProt do not remove any owl:sameAs statements. We load these into the store and then do not get them back in query answering. Basically, all triples with owl:sameAs are silently deleted by the store.

A dataset that you can use for experimenting is ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz, which is moderate in size.

Regards,
Jerven
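The check described above can also be scripted against a Sesame-style HTTP endpoint. The following is only a minimal sketch, not what anyone in the thread ran: it builds (without sending) a form-encoded SPARQL POST request for the owl:sameAs query. The function name `build_sameas_request`, the `LIMIT 10`, and the local repository URL are assumptions for illustration; the `infer` form parameter is the Sesame HTTP protocol's switch for including inferred statements, which should make no difference for explicit triples.

```python
from urllib.parse import urlencode
from urllib.request import Request

# The query from the message above, with a LIMIT added to keep the reply small.
SAMEAS_QUERY = (
    "PREFIX owl: <http://www.w3.org/2002/07/owl#> "
    "SELECT * WHERE { ?s owl:sameAs ?o } LIMIT 10"
)


def build_sameas_request(endpoint, infer=False):
    """Build (but do not send) a form-encoded SPARQL POST request.

    `endpoint` is the repository URL; `infer` maps to the Sesame HTTP
    protocol's `infer` parameter (include inferred statements or not).
    """
    body = urlencode(
        {"query": SAMEAS_QUERY, "infer": str(infer).lower()}
    ).encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


# Hypothetical local repository URL, following the pattern used in this thread.
request = build_sameas_request(
    "http://localhost:8080/openrdf-sesame/repositories/uniprot"
)
# To actually run the query: urllib.request.urlopen(request).read()
```

Sending the same request once with `infer=True` and once with `infer=False` would show whether the inference setting has any effect on the explicit owl:sameAs triples.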
Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
On 12/19/2012 12:01 PM, Vassil Momtchev wrote:
> Jerven,
>
> May I rephrase your issue and double check that I understood it correctly? After upgrading to OWLIM version 5.3 revision 5689 you observed missing sameAs triples. The system to reproduce this behavior is beta.sparql.uniprot.org.

Yes.

> I see a difference in the posted resources, P68359 vs P68353 (is sameAs the link between the two resources?). Could you clarify the following points:
>
> * What's the missing statement?

Let's use a simpler example:

  http://beta.sparql.uniprot.org/sparql?query=DESCRIBE+%3Chttp%3A%2F%2Fpurl.uniprot.org%2Fcitations%2F19660865%3E

i.e. DESCRIBE <http://purl.uniprot.org/citations/19660865>, with source data http://www.uniprot.org/citations/19660865.nt

It misses

  <http://purl.uniprot.org/citations/19660865> <http://www.w3.org/2002/07/owl#sameAs> <http://www.ncbi.nlm.nih.gov/pubmed/19660865>

once loaded into OWLIM.

> * Is it only a single triple or all statements with the owl:sameAs predicate?

All triples with the owl:sameAs predicate.

> * What's the original resource loaded to OWLIM?

ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz, which contains the owl:sameAs links. This is enough to demonstrate the problem.

> * Did you use the same revision to load the data or did you simply update the OWLIM database?

Same revision. Started totally from scratch.

Regards,
Jerven

> Thanks,
> V.
>
> --
> Vassil Momtchev
> Head of Life Science R&D of Ontotext AD
> http://www.ontotext.com
> Skype: vassil_momtchev
Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
Jerven, confirmed, I can reproduce your problem thus:

  $ wget ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz
  $ gunzip citations.rdf.gz
  $ curl -X POST -H Content-Type:application/rdf+xml -T citations.rdf http://localhost:8080/openrdf-sesame/repositories/citations/statements

And now (unlike with P68353.nt) I get only:

  $ curl -d 'query=DESCRIBE <http://purl.uniprot.org/uniprot/P68359>' http://localhost:8080/openrdf-sesame/repositories/citations
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix sesame: <http://www.openrdf.org/schema/sesame#> .
  @prefix owl: <http://www.w3.org/2002/07/owl#> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
  @prefix fn: <http://www.w3.org/2005/xpath-functions#> .

[no results]

And:

  $ curl -H Accept:application/sparql-results+json -d 'query=PREFIX owl: <http://www.w3.org/2002/07/owl#> SELECT * WHERE {?s owl:sameAs ?o} LIMIT 10' http://localhost:8080/openrdf-sesame/repositories/citations
  { "head": { "vars": [ "s", "o" ] }, "results": { "bindings": [ ] } }

[again, nothing]

Barry
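For anyone scripting this reproduction, the empty reply above is easy to detect programmatically. This is a small sketch, assuming the endpoint returns the standard SPARQL JSON results format; the helper name `sameas_pairs` and the non-empty sample document are illustrative only (the sample is built from the triple reported missing earlier in the thread, not from real server output).

```python
import json


def sameas_pairs(results_json):
    """Return (subject, object) value pairs from a SPARQL JSON result
    for a query with variables ?s and ?o."""
    doc = json.loads(results_json)
    return [
        (row["s"]["value"], row["o"]["value"])
        for row in doc["results"]["bindings"]
    ]


# The reply reported above: a well-formed result with no bindings.
EMPTY_REPLY = '{ "head": { "vars": [ "s", "o" ] }, "results": { "bindings": [ ] } }'

# Illustrative only: what one binding would look like if the triple
# from the source data were returned.
EXPECTED_REPLY = json.dumps({
    "head": {"vars": ["s", "o"]},
    "results": {"bindings": [{
        "s": {"type": "uri", "value": "http://purl.uniprot.org/intact/EBI-530932"},
        "o": {"type": "uri", "value": "http://purl.uniprot.org/uniprot/P68359"},
    }]},
})

print(len(sameas_pairs(EMPTY_REPLY)))     # prints 0
print(len(sameas_pairs(EXPECTED_REPLY)))  # prints 1
```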
Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
Further, I converted to N-Triples (note, Jena gives a lot of warnings - it's probably worth duplicating this and taking note), and pulled out only the sameAs statements:

  $ apache-jena-2.7.4/bin/rdfparse citations.rdf > citations.rdf.nt
  $ wc citations.rdf.nt
  12693123 217642066 2491240008 citations.rdf.nt
  $ grep 'sameAs' citations.rdf.nt > citations-sameAs.rdf.nt
  $ wc citations-sameAs.rdf.nt
  789525 3158100 103875838 citations-sameAs.rdf.nt

These I upload by themselves:

  $ curl -X POST -H Content-Type:text/turtle -T citations-sameAs.rdf.nt http://localhost:8080/openrdf-sesame/repositories/citations/statements

I then received the same (empty) results as before. So I think we can disregard the effect of other data and concentrate purely on what's happening to these sameAs statements.

This requires looking into the internals, so I'll leave you in Damyan's capable hands.

Barry
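A plain `grep 'sameAs'` also matches lines where the string merely appears in a literal or in some other IRI. As a hedged alternative, the sketch below counts only N-Triples lines whose predicate is exactly owl:sameAs. The helper names and the two sample lines are made up for illustration (the second uses a hypothetical example.org statement); the file name in the usage note follows the one used in the message above.

```python
OWL_SAMEAS = "<http://www.w3.org/2002/07/owl#sameAs>"


def is_sameas_statement(line):
    """True if the N-Triples line has owl:sameAs as its predicate.

    Subjects and predicates in N-Triples contain no whitespace, so
    splitting off the first two tokens is enough to isolate the predicate.
    """
    parts = line.split(None, 2)
    return len(parts) == 3 and parts[1] == OWL_SAMEAS


def count_sameas(lines):
    """Count owl:sameAs statements in an iterable of N-Triples lines."""
    return sum(1 for line in lines if is_sameas_statement(line))


SAMPLE = [
    # The statement reported missing for citation 19660865.
    "<http://purl.uniprot.org/citations/19660865> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://www.ncbi.nlm.nih.gov/pubmed/19660865> .",
    # Hypothetical line that grep 'sameAs' would also match.
    '<http://example.org/s> <http://example.org/comment> "mentions sameAs" .',
]

print(count_sameas(SAMPLE))  # prints 1
```

To count the statements in the converted dump, the same function can be fed an open file, e.g. `count_sameas(open("citations.rdf.nt", encoding="utf-8"))`; comparing that number with the store's answer gives a quick measure of how many statements went missing.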
Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
The Jena warnings are another issue, on the UniProt RDF production site. Not for you to worry about; in any case they will be fixed by the next UniProt data release on the 9th of January.

Thanks for looking into the owl:sameAs issue.

Regards,
Jerven
Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
You're welcome, just something I spotted along the way. Sorry I've not found a solution, but I hope that reproducing and paring down is a step in the right direction and we can solve this quickly. All the best, Barry On 19/12/2012 13:52, Jerven Bolleman wrote: The jena warnings are another issue, on the uniprot rdf production site. Not for you to worry about, in any case will be fixed by the next uniprot data release on the 9th of January. Thanks for looking into the owl:sameAs issue. Regards, Jerven On 12/19/2012 02:47 PM, Barry Norton wrote: Further, I converted to NTriples (note, Jena gives a lot of warnings - it's probably worth duplicating this and taking note), and pulled out only the sameAs statements: $ apache-jena-2.7.4/bin/rdfparse citations.rdf citations.rdf.nt $ wc citations.rdf.nt 12693123 217642066 2491240008 citations.rdf.nt $ grep 'sameAs' citations.rdf.nt citations-sameAs.r.nt $ wc citations-sameAs.rdf.nt 789525 3158100 103875838 citations-sameAs.rdf.nt These I upload by themselves: $ curl -X POST -H Content-Type:text/turtle -T citations-sameAs.rdf.nt http://localhost:8080/openrdf-sesame/repositories/citations/statements I received then the same results as below. So I think we can disregard the effect of other data and concentrate purely on what's happening to these sameAs statements. This requires looking into the internals, so I'll leave you in Damyan's capable hands. Barry On 19/12/12 12:42, Barry Norton wrote: Jerven, confirmed I can reproduce your problem thus: $ wget ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz $ gunzip citations.rdf.gz $ curl -X POST -H Content-Type:application/rdf+xml -T citations.rdf http://localhost:8080/openrdf-sesame/repositories/citations/statements And now (unlike with P68353.nt) get only: $ curl -d query=DESCRIBE http://purl.uniprot.org/uniprot/P68359 http://localhost:8080/openrdf-sesame/repositories/citations @prefix rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# . 
@prefix rdfs: http://www.w3.org/2000/01/rdf-schema# . @prefix sesame: http://www.openrdf.org/schema/sesame# . @prefix owl: http://www.w3.org/2002/07/owl# . @prefix xsd: http://www.w3.org/2001/XMLSchema# . @prefix fn: http://www.w3.org/2005/xpath-functions# . [no results] And: $ curl -H Accept:application/sparql-results+json -d query=PREFIX owl: http://www.w3.org/2002/07/owl# SELECT * WHERE {?s owl:sameAs ?o} LIMIT 10 http://localhost:8080/openrdf-sesame/repositories/citations { head: { vars: [ s, o ] }, results: { bindings: [ ] } [again, nothing] Barry On 19/12/12 11:50, Jerven Bolleman wrote: Hi Barry, No there are two systems which show the same problem. All triples with owl:sameAs as predicate are gone. i.e. PREFIX owl: http://www.w3.org/2002/07/owl# SELECT * WHERE {?s owl:sameAs ?o} returns no results. Both at linkedlifedata and at beta.sparql.uniprot.org. Apparently Vassil claims that linkedlifedata deletes all owl:sameAs statements and therefore should not return results. This is silly but ok, if you really do that then you can not replicate my problem at that endpoint. However, we at uniprot do not remove any owl:sameAs statement. We load these into the store and then do not get them back in query answering. Basically all triples with owl:sameAs are silently deleted by the store. A dataset that you can use for experimenting is ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/citations.rdf.gz Which is moderate in size. Regards, Jerven On 12/19/2012 11:38 AM, Barry Norton wrote: Thanks for the clarification - I believe Vassil has answered the question in the mean time, this is down to the way the data is loaded and exposed in LinkedLifeData. OWLIM does index the equivalence in the raw dataset and, under appropriate conditions, will infer the symmetric dual. Barry On 19/12/12 10:24, Jerven Bolleman wrote: Hi Barry, The workbench is not involved its just using directly the sesame api. e.g. 
((GraphQuery) query).evaluate();

As the triple is explicit in the dataset, setIncludeInferred true or false should not matter. Testing with the debugger, it is unaffected. To be clear, we load the data using ruleset=none, which might impact the scenario. The query used to work; it is only in the latest version that I notice the triple is gone.

Regards,
Jerven

On 12/19/2012 11:03 AM, Barry Norton wrote:

Jerven, using owlim-se-5.3.5689 I just executed:

$ wget http://www.uniprot.org/uniprot/P68353.nt
$ curl -X POST -H Content-Type:text/turtle -T P68353.nt http://localhost:8080/openrdf-sesame/repositories/uniprot/statements
$ curl -d "query=DESCRIBE <http://purl.uniprot.org/intact/EBI-530932>" http://localhost:8080/openrdf-sesame/repositories/uniprot

I got:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sesame: <http://www.openrdf.org/schema/sesame#> .
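The grep 'sameAs' step quoted above keeps any line that contains that string, which could in principle include literals or unrelated IRIs. A minimal sketch of a stricter filter follows, assuming well-formed N-Triples with one triple per line, so that the second whitespace-separated field is always the predicate. The sample data and file names here are made up for illustration and do not come from citations.rdf.

```shell
#!/bin/sh
# Build a tiny, hypothetical N-Triples sample (not taken from citations.rdf).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.nt" <<'EOF'
<http://example.org/a> <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/b> .
<http://example.org/a> <http://www.w3.org/2000/01/rdf-schema#label> "mentions sameAs in a literal" .
<http://example.org/c> <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/d> .
EOF

# Keep only lines whose predicate (field 2) is exactly the owl:sameAs IRI.
# Subjects in N-Triples are IRIs or blank nodes and never contain whitespace,
# so field 2 is reliably the predicate.
awk '$2 == "<http://www.w3.org/2002/07/owl#sameAs>"' \
    "$tmpdir/sample.nt" > "$tmpdir/sample-sameAs.nt"

# Compare a plain substring match with the predicate match.
loose_count=$(grep -c 'sameAs' "$tmpdir/sample.nt")
strict_count=$(wc -l < "$tmpdir/sample-sameAs.nt" | tr -d ' ')
echo "substring matches: $loose_count, predicate matches: $strict_count"
```

On the sample, the substring match keeps all three lines while the predicate match keeps two. In the thread's own numbers, wc reports 3158100 words over 789525 lines, exactly four per line, which is what one would expect if every matched line were a plain IRI-to-IRI triple; that is an inference from the counts, not something the thread confirms.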
Re: [Owlim-discussion] {Disarmed} Re: missing owl:sameAs triples
Hi Jerven,

if your repository's ruleset is indeed preset to 'empty' and you do not use any inference, then please use the -Ddisable-sameAs=true JVM option on your service, or alter the repository configuration, until we release a proper 5.3 release with the fix ...

regards,
damyan

On 12/19/2012 3:52 PM, Jerven Bolleman wrote:

The Jena warnings are another issue, on the UniProt RDF production site. Not for you to worry about; in any case they will be fixed by the next UniProt data release on the 9th of January. Thanks for looking into the owl:sameAs issue.

Regards,
Jerven

On 12/19/2012 02:47 PM, Barry Norton wrote:

Further, I converted to NTriples (note, Jena gives a lot of warnings - it's probably worth duplicating this and taking note), and pulled out only the sameAs statements:

$ apache-jena-2.7.4/bin/rdfparse citations.rdf > citations.rdf.nt
$ wc citations.rdf.nt
12693123 217642066 2491240008 citations.rdf.nt
$ grep 'sameAs' citations.rdf.nt > citations-sameAs.rdf.nt
$ wc citations-sameAs.rdf.nt
789525 3158100 103875838 citations-sameAs.rdf.nt

These I upload by themselves:

$ curl -X POST -H Content-Type:text/turtle -T citations-sameAs.rdf.nt http://localhost:8080/openrdf-sesame/repositories/citations/statements

I then received the same results as below. So I think we can disregard the effect of other data and concentrate purely on what's happening to these sameAs statements. This requires looking into the internals, so I'll leave you in Damyan's capable hands.
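Damyan's workaround in this message is a JVM system property, so it has to reach the Java process that hosts the repository. A minimal sketch of one way to pass it follows, assuming the Sesame web application runs under Apache Tomcat; the thread's localhost:8080/openrdf-sesame URLs suggest a servlet container but do not say which one. CATALINA_OPTS is Tomcat's environment variable for extra JVM options; other containers will need their own equivalent.

```shell
#!/bin/sh
# Assumption: OWLIM/Sesame is deployed in Tomcat, which reads CATALINA_OPTS
# at startup (for example from bin/setenv.sh).
flag="-Ddisable-sameAs=true"

# Append the flag only if it is not already present, so that sourcing this
# snippet more than once does not duplicate it.
case " ${CATALINA_OPTS:-} " in
  *" $flag "*) ;;  # already set, nothing to do
  *) CATALINA_OPTS="${CATALINA_OPTS:+$CATALINA_OPTS }$flag" ;;
esac
export CATALINA_OPTS

echo "CATALINA_OPTS=$CATALINA_OPTS"
```

The container has to be restarted for the option to take effect. Once the fixed 5.3 release that Damyan mentions is installed, the flag would presumably no longer be needed, though the thread does not say so explicitly.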