Query causes a StackOverflowError
Hi there,

Firstly I would just like to say that whilst we have only been using Elda and Fuseki for about a year, we have until now been really very happy with them. Excellent stuff :-)

Unfortunately, today, we have a query that is generated by Elda and POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is about 1.4MB! This query causes Fuseki to throw a java.lang.StackOverflowError.

The only other post I found on the mailing list which looks similar was from 2011 (http://markmail.org/message/pwzdrcn7lnkqra35), but there was no follow up to it.

We really need to solve this issue quickly. I am not opposed to getting my hands dirty in the Jena code base if someone can tell me what needs to be done, and support me when I have questions. But hopefully there is some sort of quick workaround?

So then, what are my options chaps?

You may access the stack trace here: https://dl.dropboxusercontent.com/u/35135948/StackOverflowError.txt
and the query that caused the exception here: https://dl.dropboxusercontent.com/u/35135948/fuseki-query.txt

Sorry for the use of DropBox, but the Apache mailing list manager kept rejecting my post as it was too large otherwise.

Thanks Adam.

--
Adam Retter
skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk
Re: Query causes a StackOverflowError
Hi Adam,

On 16/03/14 18:58, Adam Retter wrote:
> Hi there, Firstly I would just like to say that whilst we have only been using Elda and Fuseki for about a year, we have until now been really very happy with them. Excellent stuff :-)

Which versions?

And is this using sub-SELECTs enabled in Elda? IIRC that will replace the nearly 10,000 cases of UNION with the SELECT that generated it in the first place, which is much (much) shorter.

> Unfortunately, today, we have a query that is generated by Elda and POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is about 1.4MB! Unfortunately this query causes Fuseki to throw a java.lang.StackOverflowError. The only other post I found on the mailing list which looks similar was from 2011 http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow up to it.

Have you tried increasing the stack? What happens?

Andy
Re: Query causes a StackOverflowError
Thanks for the quick reply Andy -

On 16 March 2014 20:48, Andy Seaborne a...@apache.org wrote:
> Which versions?

Fuseki 0.2.6. I did check the release notes and commit history for newer versions, but did not see any detail of bug fixes that might address this. Is it possible that it has been fixed and I just missed it?

I will have to check the version of Elda tomorrow when I am in the office and get back to you.

> And is this using sub-SELECTs enabled in Elda? IIRC that will replace the nearly 10,000 cases of UNION with the SELECT that generated it in the first place, which is much (much) shorter.

I will check tomorrow and get back to you. If we are not doing that, then that sounds very promising.

> Have you tried increasing the stack? What happens?

I can certainly give that a try tomorrow as well. However, I normally try to avoid doing this and prefer to fix the problem at its source; otherwise I may just delay the inevitable to some time in the future when I run a larger query ;-)

--
Adam Retter
skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk
Re: Query causes a StackOverflowError
On 16/03/14 21:02, Adam Retter wrote:
> Fuseki 0.2.6. I did check the release notes and commit history for newer versions, but did not see any detail of bug fixes that might address this. Is it possible that it has been fixed and I just missed it?

It's nothing to do with Fuseki, which is just the protocol handling.

The stacktrace is in algebra generation because

    {P1} UNION {P2} UNION {P3}

is

    (union {P1} (union {P2} (union {P3} ...

It hasn't even got to the optimizer. But then sending and parsing 1.4MB queries is never going to be fast.

> I can certainly give that a try tomorrow as well. However, I normally try to avoid doing this and prefer to fix the problem at its source; otherwise I may just delay the inevitable to some time in the future when I run a larger query ;-)

It is more information with which to debug.

Andy
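To make the nesting argument concrete, here is a minimal sketch in plain Java. It does not use Jena: the Node class and every method name are hypothetical stand-ins for an algebra tree. It builds the right-nested shape that a chain of about 10,000 UNIONs produces, then counts the leaf patterns three ways: recursively (one stack frame per nesting level, which is what can overflow), iteratively with an explicit stack, and recursively on a thread that requests a larger stack. The larger-stack variant corresponds to the "increase the stack" suggestion, which for a whole JVM is normally done with the -Xss option.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; these are not Jena's algebra classes.
class UnionDepthDemo {

    /** A leaf pattern (label != null) or a binary union of two nodes. */
    static final class Node {
        final String label;
        final Node left, right;
        Node(String label) { this.label = label; this.left = null; this.right = null; }
        Node(Node left, Node right) { this.label = null; this.left = left; this.right = right; }
        boolean isLeaf() { return label != null; }
    }

    /** Builds (union P1 (union P2 (... Pn))) with a loop, so building never recurses. */
    static Node rightNested(int n) {
        Node acc = new Node("P" + n);
        for (int i = n - 1; i >= 1; i--) {
            acc = new Node(new Node("P" + i), acc);
        }
        return acc;
    }

    /** Recursive leaf count: call depth grows with the nesting depth of the tree. */
    static int countRecursive(Node n) {
        return n.isLeaf() ? 1 : countRecursive(n.left) + countRecursive(n.right);
    }

    /** Same count with an explicit stack on the heap, so call depth stays constant. */
    static int countIterative(Node root) {
        Deque<Node> todo = new ArrayDeque<>();
        todo.push(root);
        int leaves = 0;
        while (!todo.isEmpty()) {
            Node n = todo.pop();
            if (n.isLeaf()) {
                leaves++;
            } else {
                todo.push(n.left);
                todo.push(n.right);
            }
        }
        return leaves;
    }

    /** Runs the recursive count on a thread that requests the given stack size (a hint to the JVM). */
    static int countWithStack(Node root, long stackBytes) {
        int[] result = new int[1];
        Throwable[] failure = new Throwable[1];
        Thread t = new Thread(null, () -> {
            try {
                result[0] = countRecursive(root);
            } catch (StackOverflowError e) {
                failure[0] = e;
            }
        }, "deep-walk", stackBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        if (failure[0] != null) {
            throw new IllegalStateException("requested stack was too small", failure[0]);
        }
        return result[0];
    }

    public static void main(String[] args) {
        Node query = rightNested(10_000);
        System.out.println("iterative count: " + countIterative(query));
        System.out.println("recursive count, 64MB stack: " + countWithStack(query, 64L * 1024 * 1024));
    }
}
```

The iterative version is the one that does not depend on stack size at all, which is the "fix it at the source" option; the larger stack only moves the limit further out.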
Re: Jena + pellet Reasoner
Thanks a lot Miguel. My TDB is around 17G and the Jena rule engine will take forever to do the inferencing.

Thanks

On Sat, Mar 15, 2014 at 5:53 AM, Miguel Bento Alves mbentoal...@gmail.com wrote:
> To the best of my knowledge, you can't do it with OWL or OWL2 (Pellet reasoner). You need something like rules or SPIN rules.
>
> Miguel
>
> On 15/03/14 03:14, Adeeb Noor adeeb.n...@colorado.edu wrote:
>> Would someone please help me with the question? Thanks in advance.
>>
>> On Fri, Mar 14, 2014 at 3:04 PM, Adeeb Noor adeeb.n...@colorado.edu wrote:
>>> Need your advice below please when you have a chance. Miguel thought I need to do it with the rule engine, but in fact I want to do the work with the Pellet reasoner. Thanks and sorry for bothering.
>>>
>>> -- Forwarded message --
>>> From: Adeeb Noor adeeb.n...@colorado.edu
>>> Date: Fri, Mar 14, 2014 at 2:08 AM
>>> Subject: Jena + pellet Reasoner
>>> To: users@jena.apache.org
>>>
>>> Hello everyone:
>>>
>>> I have been struggling a lot with a problem that I did not find a solution for, so hopefully you guys can guide me or help me with it.
>>>
>>> I have my data (RDFS) stored in Jena TDB as a model, and my OWL schema built with Protégé. Here is the code to merge data and schema:
>>>
>>>     System.out.println("creating inferred dataset");
>>>     Dataset dataset = TDBFactory.createDataset(data.infereedTDB);
>>>     System.out.println("creating OntModel");
>>>     // Pellet-backed ontology model over the named graph in the target TDB
>>>     OntModel Infmodel = ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC, dataset.getNamedModel(this.URL));
>>>     System.out.println("adding schema (OWL) to OntModel");
>>>     Infmodel.add(this.owl);
>>>     System.out.println("adding data (RDF) to OntModel");
>>>     Infmodel.add(data.tdb);
>>>     System.out.println("creating ModelExtractor");
>>>     // Extract the inferred statements and store them back in the named graph
>>>     ModelExtractor ext = new ModelExtractor(Infmodel);
>>>     dataset.replaceNamedModel(this.URL, ext.extractModel());
>>>     System.out.println("saving inferred model");
>>>     Infmodel.close();
>>>     System.out.println("closing inferred dataset");
>>>     dataset.close();
>>>
>>> So I have the ability to store my inferred data into a new TDB and to reason or build any rule based on literal values.
>>>
>>> For example, these are the triples for one subject that has UMLS_type as a property; I can group all subjects with two or more UMLS_types, for instance.
>>>
>>>     ddidd:C0007586 | ddids:label | Cell Cycle
>>>     ddidd:C0007586 | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | ddids:Pathway
>>>     ddidd:C0007586 | ddids:UMLS_type | T043
>>>     ddidd:C0007586 | ddids:x-kegg.pathway | http://identifiers.org/kegg.pathway/hsa04110
>>>
>>> However, as you can see from the triples above, x-kegg.pathway has an external URI as its value. By default Protégé takes it to be an object property. What I cannot do is write a rule to, for example, group all subjects under the same x-kegg.pathway number, since it is an external URI. For example, I want to create a class called sameKEGG whose members have the value http://identifiers.org/kegg.pathway/hsa04110.
>>>
>>> BTW, I can do such a thing easily with SPARQL, but I would love to use the reasoner to make complex rules.
>>>
>>> I have been struggling with this for a while and I would frankly appreciate any comments or feedback.
>>>
>>> --
>>> Adeeb Noor
>>> Ph.D. Candidate
>>> Dept of Computer Science
>>> University of Colorado at Boulder
>>> Cell: 571-484-3303
>>> Email: adeeb.n...@colorado.edu
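The grouping asked for here (all subjects that share the same x-kegg.pathway value) can be stated as a plain operation over triples. The sketch below only illustrates the intended result. It uses neither Jena nor Pellet, the Triple class and groupByObject method are hypothetical, and the second subject in the sample data is invented for the example. It is not a substitute for a reasoner rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Jena or Pellet API.
class SameKeggSketch {

    /** Minimal subject/predicate/object holder using plain strings. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /** Groups subjects by the object value of the given predicate, keeping first-seen order. */
    static Map<String, List<String>> groupByObject(List<Triple> triples, String predicate) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Triple t : triples) {
            if (t.p.equals(predicate)) {
                groups.computeIfAbsent(t.o, k -> new ArrayList<>()).add(t.s);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String kegg = "http://identifiers.org/kegg.pathway/hsa04110";
        List<Triple> data = new ArrayList<>();
        data.add(new Triple("ddidd:C0007586", "ddids:UMLS_type", "T043"));
        data.add(new Triple("ddidd:C0007586", "ddids:x-kegg.pathway", kegg));
        // Invented second subject, present only to show two members in one group.
        data.add(new Triple("ddidd:EXAMPLE", "ddids:x-kegg.pathway", kegg));

        // Members of the would-be sameKEGG class for this pathway value.
        System.out.println(groupByObject(data, "ddids:x-kegg.pathway").get(kegg));
    }
}
```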