Query causes a StackOverflowError

2014-03-16 Thread Adam Retter
Hi there,

Firstly I would just like to say that whilst we have only been using
Elda and Fuseki for about a year, we have until now been really very
happy with them. Excellent stuff :-)

Unfortunately, today, we have a query that is generated by Elda and
POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is
about 1.4MB!

Unfortunately this query causes Fuseki to throw a
java.lang.StackOverflowError. The only other post I found on the
mailing list which looks similar was from 2011
http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow
up to it.

Unfortunately, we really need to solve this issue quickly. I am not
opposed to getting my hands dirty in the Jena code base if someone can
tell me what needs to be done, and support me when I have questions.
But hopefully there is some sort of quick workaround? So then, what
are my options chaps?

You may access the stack trace here:
https://dl.dropboxusercontent.com/u/35135948/StackOverflowError.txt
and the query that caused the exception here:
https://dl.dropboxusercontent.com/u/35135948/fuseki-query.txt
Sorry for the use of Dropbox, but the Apache mailing list manager
kept rejecting my post as it was too large otherwise.

Thanks Adam.

-- 
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk


Re: Query causes a StackOverflowError

2014-03-16 Thread Andy Seaborne

Hi Adam,

On 16/03/14 18:58, Adam Retter wrote:

 Hi there,

 Firstly I would just like to say that whilst we have only been using
 Elda and Fuseki for about a year, we have until now been really very
 happy with them. Excellent stuff :-)


Which versions?

And is this using sub-SELECTs enabled in Elda?

IIRC that will replace the nearly 10,000 cases of UNION with the SELECT 
that generated it in the first place, which is much (much) shorter.




 Unfortunately, today, we have a query that is generated by Elda and
 POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is
 about 1.4MB!

 Unfortunately this query causes Fuseki to throw a
 java.lang.StackOverflowError. The only other post I found on the
 mailing list which looks similar was from 2011
 http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow
 up to it.


Have you tried increasing the stack?  What happens?





Andy


Re: Query causes a StackOverflowError

2014-03-16 Thread Adam Retter
Thanks for the quick reply Andy -

On 16 March 2014 20:48, Andy Seaborne a...@apache.org wrote:
 Hi Adam,


 On 16/03/14 18:58, Adam Retter wrote:

 Hi there,

 Firstly I would just like to say that whilst we have only been using
 Elda and Fuseki for about a year, we have until now been really very
 happy with them. Excellent stuff :-)


 Which versions?

Fuseki 0.2.6, I did check the release notes and commit history for
newer versions, but did not see any detail of bug fixes that might
address this. Is it possible that it has been fixed and I just missed
it?

I will have to check the version of Elda tomorrow when I am in the
office and get back to you.

 And is this using sub-SELECTs enabled in Elda?

 IIRC that will replace the nearly 10,000 cases of UNION with the SELECT that
 generated it in the first place, which is much (much) shorter.

I will check tomorrow and get back to you. If we are not doing that,
then that sounds very promising.



 Unfortunately, today, we have a query that is generated by Elda and
 POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is
 about 1.4MB!

 Unfortunately this query causes Fuseki to throw a
 java.lang.StackOverflowError. The only other post I found on the
 mailing list which looks similar was from 2011
 http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow
 up to it.


 Have you tried increasing the stack?  What happens?

I can certainly give that a try tomorrow as well. However, I normally
try to avoid doing this, and prefer to fix the problem at its source,
otherwise I may just delay the inevitable to sometime in the future
when I run with a larger query ;-)







Re: Query causes a StackOverflowError

2014-03-16 Thread Andy Seaborne

On 16/03/14 21:02, Adam Retter wrote:

 Thanks for the quick reply Andy -

 On 16 March 2014 20:48, Andy Seaborne a...@apache.org wrote:

  Hi Adam,

  On 16/03/14 18:58, Adam Retter wrote:

   Hi there,

   Firstly I would just like to say that whilst we have only been using
   Elda and Fuseki for about a year, we have until now been really very
   happy with them. Excellent stuff :-)

  Which versions?

 Fuseki 0.2.6, I did check the release notes and commit history for
 newer versions, but did not see any detail of bug fixes that might
 address this. Is it possible that it has been fixed and I just missed
 it?


It's nothing to do with Fuseki, which is just the protocol handling. 
The stacktrace is in algebra generation because


{P1} UNION {P2} UNION {P3}

is

(union
  {P1}
  (union
    {P2}
    (union
      {P3}
      ...

It hasn't even got to the optimizer.
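
For readers following along, here is a minimal Java sketch of why that nesting matters. It assumes a plain binary tree stands in for the nested union shown above; the class UnionDepthSketch and its Node type and methods are hypothetical and are not Jena classes. A recursive walk needs one stack frame per nesting level, so its stack use grows with the number of UNION branches, while a walk that keeps its work list on the heap does not.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a nested binary union; these are NOT Jena classes.
final class UnionDepthSketch {

    /** Either a leaf pattern (no children) or a binary union of two subtrees. */
    static final class Node {
        final String label;
        final Node left;
        final Node right;

        Node(String label) {
            this.label = label;
            this.left = null;
            this.right = null;
        }

        Node(Node left, Node right) {
            this.label = "union";
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null;
        }
    }

    /** Builds (union P1 (union P2 (union P3 ...))) with a loop, so building never recurses. */
    static Node buildRightNested(int patterns) {
        Node tree = new Node("P" + patterns);
        for (int i = patterns - 1; i >= 1; i--) {
            tree = new Node(new Node("P" + i), tree);
        }
        return tree;
    }

    /** Recursive leaf count: call depth equals the nesting depth of the tree. */
    static int countLeavesRecursive(Node n) {
        if (n.isLeaf()) {
            return 1;
        }
        return countLeavesRecursive(n.left) + countLeavesRecursive(n.right);
    }

    /** Iterative leaf count: pending nodes live in a heap-allocated deque. */
    static int countLeavesIterative(Node root) {
        Deque<Node> todo = new ArrayDeque<>();
        todo.push(root);
        int leaves = 0;
        while (!todo.isEmpty()) {
            Node n = todo.pop();
            if (n.isLeaf()) {
                leaves++;
            } else {
                todo.push(n.right);
                todo.push(n.left);
            }
        }
        return leaves;
    }

    public static void main(String[] args) {
        // Roughly the number of UNION branches mentioned in the thread.
        Node tree = buildRightNested(10_000);
        System.out.println("iterative leaves: " + countLeavesIterative(tree));
        try {
            System.out.println("recursive leaves: " + countLeavesRecursive(tree));
        } catch (StackOverflowError e) {
            // Whether this happens depends on the JVM, its stack size, and frame size.
            System.out.println("recursive walk overflowed the stack");
        }
    }
}
```

Whether the recursive walk in this toy actually overflows depends on the JVM and its stack size; the frames of a real query compiler are likely larger than these, so it would run out sooner.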

But then sending and parsing a 1.4MB query is never going to be fast.


 I will have to check the version of Elda tomorrow when I am in the
 office and get back to you.

  And is this using sub-SELECTs enabled in Elda?

  IIRC that will replace the nearly 10,000 cases of UNION with the SELECT that
  generated it in the first place, which is much (much) shorter.

 I will check tomorrow and get back to you. If we are not doing that,
 then that sounds very promising.

   Unfortunately, today, we have a query that is generated by Elda and
   POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is
   about 1.4MB!

   Unfortunately this query causes Fuseki to throw a
   java.lang.StackOverflowError. The only other post I found on the
   mailing list which looks similar was from 2011
   http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow
   up to it.

  Have you tried increasing the stack?  What happens?

 I can certainly give that a try tomorrow as well. However, I normally
 try to avoid doing this, and prefer to fix the problem at its source,
 otherwise I may just delay the inevitable to sometime in the future
 when I run with a larger query ;-)


It is more information with which to debug.
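
As an illustration of that experiment, a minimal Java sketch, assuming the deep recursion can be run on a thread of your own. The class BigStackSketch and its methods are hypothetical, and depth() only stands in for the overflowing work. It uses the standard Thread constructor that takes a requested stack size; the JVM is allowed to treat that size as a hint. For a whole server process, the usual alternative is the JVM's -Xss option.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper for experimenting with stack size; not part of Jena or Fuseki.
final class BigStackSketch {

    /** Deliberately deep recursion standing in for the work that overflows. */
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    /**
     * Runs depth(recursionDepth) on a new thread that requests stackBytes of
     * stack, and reports either the result or that the stack overflowed.
     */
    static String runWithStack(long stackBytes, int recursionDepth) {
        AtomicReference<String> outcome = new AtomicReference<>("not run");
        Runnable task = () -> {
            try {
                outcome.set("ok: " + depth(recursionDepth));
            } catch (StackOverflowError e) {
                outcome.set("StackOverflowError");
            }
        };
        // The last argument is the requested stack size in bytes (a hint to the JVM).
        Thread worker = new Thread(null, task, "big-stack", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        // Compare a small and a large requested stack for the same recursion depth.
        System.out.println("small stack: " + runWithStack(64L * 1024, 200_000));
        System.out.println("large stack: " + runWithStack(256L * 1024 * 1024, 200_000));
    }
}
```

If the larger stack gets further, that confirms the failure scales with nesting depth rather than pointing to an infinite loop, which is useful to know even if a bigger stack is not the final fix.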








 Andy








Re: Jena + pellet Reasoner

2014-03-16 Thread Adeeb Noor
Thanks a lot Miguel.

My TDB is around 17 GB and the Jena rule engine will take forever to do
the inference.

Thanks


On Sat, Mar 15, 2014 at 5:53 AM, Miguel Bento Alves
mbentoal...@gmail.com wrote:

 To the best of my knowledge, you can't do it with OWL or OWL 2 (Pellet
 reasoner). You need something like rules or SPIN rules.

 Miguel

 On 15/03/14 03:14, Adeeb Noor adeeb.n...@colorado.edu wrote:

 Would someone please help me with the question
 
 Thanks in advance
 
 
 On Fri, Mar 14, 2014 at 3:04 PM, Adeeb Noor adeeb.n...@colorado.edu
 wrote:
 
 
  Need your advice below, please, when you have a chance.

  Miguel thought I need to do it with the rule engine, but in fact I want
  to do the work with the Pellet reasoner.

  Thanks, and sorry for bothering.
 
 
  -- Forwarded message --
  From: Adeeb Noor adeeb.n...@colorado.edu
  Date: Fri, Mar 14, 2014 at 2:08 AM
  Subject: Jena + pellet Reasoner
  To: users@jena.apache.org users@jena.apache.org
 
 
  Hello everyone:
 
  I have been struggling a lot with a problem that I did not find a
  solution for, so hopefully you guys can guide me or help me with it.

  I have my data (RDFS) stored in Jena TDB as a model, and my OWL schema
  built using Protégé.
 
  Here is the code to merge data and schema:
 
  // Open the TDB dataset that will hold the inferred triples.
  System.out.println("creating inferred dataset");
  Dataset dataset = TDBFactory.createDataset(data.infereedTDB);

  // Wrap the target named model in a Pellet-backed ontology model.
  System.out.println("creating OntModel");
  OntModel Infmodel =
      ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC,
          dataset.getNamedModel(this.URL));

  // Add the schema and the instance data so Pellet can reason over both.
  System.out.println("adding schema (OWL) to OntModel");
  Infmodel.add(this.owl);

  System.out.println("adding data (RDF) to OntModel");
  Infmodel.add(data.tdb);

  // Extract the inferred statements and store them in the named model.
  System.out.println("creating ModelExtractor");
  ModelExtractor ext = new ModelExtractor(Infmodel);

  dataset.replaceNamedModel(this.URL, ext.extractModel());

  System.out.println("saving inferred model");
  Infmodel.close();
  System.out.println("closing inferred dataset");
  dataset.close();
 
  So I have the ability to store my inferred data in a new TDB and to
  reason or build any rule based on literal values. For example, these are
  the triples for one subject that has UMLS_type as a property, and I can
  group all subjects with two or more UMLS_types, for instance.
 
  ddidd:C0007586 | ddids:label | Cell Cycle
  ddidd:C0007586 | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | ddids:Pathway
  ddidd:C0007586 | ddids:UMLS_type | T043
  ddidd:C0007586 | ddids:x-kegg.pathway | http://identifiers.org/kegg.pathway/hsa04110
 
  However, as you can see from the triples above, x-kegg.pathway has an
  external URI as its value. By default Protégé takes it to be an object
  property. What I cannot do is write a rule, for example, to group all
  subjects under the same x-kegg.pathway number, since it is an external
  URI. For example, I want to create a class called sameKEGG whose members
  have the value http://identifiers.org/kegg.pathway/hsa04110. BTW, I can
  do such a thing easily with SPARQL, but I would love to use the reasoner
  to make complex rules.

  I have been struggling with this for a while and I would sincerely
  appreciate any comments or feedback.
 
  --
  Adeeb Noor
  Ph.D. Candidate
  Dept of Computer Science
  University of Colorado at Boulder
  Cell: 571-484-3303
  Email: adeeb.n...@colorado.edu
 
 
 