Re: Query causes a StackOverflowError

2014-03-18 Thread Adam Retter
Well, unfortunately we are not there yet; we are having to re-engineer
quite a lot of things around the re-design of our query.

In the meantime we have tried to roll-back and are considering
increasing the stack size as Andy suggested. However, my fear is that
this will not actually help us at all. Although it was not clear in my
original email, because I cut out some of the stacktrace to save
bytes, my reasoning is thus:

Our query contains 9,669 unions; however, in the stack trace we see
270,000+ frames, which are mostly the following repeating over and over
again:

at com.hp.hpl.jena.sparql.algebra.op.OpUnion.visit(OpUnion.java:49)
at com.hp.hpl.jena.sparql.algebra.OpWalker$WalkerVisitor.visit2(OpWalker.java:102)
at com.hp.hpl.jena.sparql.algebra.OpVisitorByType.visit(OpVisitorByType.java:74)

As there is such a disparity between the number of unions we actually
want to perform and the number of stack frames for OpUnion, I am
rather inclined to believe at the moment that this is a recursion bug.
My concern is that no matter how large I grow the stack size, it will
continue to overflow. Would that be a correct assumption? ...or is it
correct that I should have 270,000+ OpUnion frames for a union of just
9,669 resources?
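
A back-of-the-envelope check of that disparity, as a sketch only. It assumes the trace holds exactly 270,000 frames (the message only says "270,000+") and that every nested operator costs exactly the three frames quoted above; both figures are assumptions made for illustration, not measurements.

```python
UNIONS_IN_QUERY = 9_669      # figure given in the message
FRAMES_IN_TRACE = 270_000    # assumed lower bound; the message says "270,000+"
FRAMES_PER_LEVEL = 3         # assumed: OpUnion.visit, visit2, OpVisitorByType.visit

# How many stack frames the trace holds for each union in the query.
frames_per_union = FRAMES_IN_TRACE / UNIONS_IN_QUERY

# How many nested operators the trace implies if each costs three frames.
implied_levels = FRAMES_IN_TRACE // FRAMES_PER_LEVEL
levels_per_union = implied_levels / UNIONS_IN_QUERY

print(f"frames per union: {frames_per_union:.1f}")
print(f"implied nesting:  {implied_levels} levels")
print(f"levels per union: {levels_per_union:.1f}")
```

Under those assumptions the trace is roughly nine nested levels deep per union. Whether that comes from extra operators inside each UNION branch or from something being traversed repeatedly cannot be told from the excerpt alone.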

Thanks

-- 
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk


Re: Re: Query causes a StackOverflowError

2014-03-18 Thread Chris Dollin
On Monday, March 17, 2014 06:17:10 PM Adam Retter wrote:
 
 We did try using the:
 
 yourSPARQLEndpoint elda:supportsNestedSelects true.
 
 Although I think we found that the option was actually (non-plural):
 
 yourSPARQLEndpoint elda:supportsNestedSelect true.

Oops, sorry.

 However, whilst Elda reported that it did indeed support the
 NestedSelect, it continued to create that absolutely massive union for
 our query.
 
 The good news is my colleague Rob was able to change Elda's config
 file so that it generated a much simpler query for us. So we have now
 moved past that one. If anyone wants more info on this, please say,
 and I will ask him to post some detail.

I think this is a problem with views that ask for labels on the viewed
items (i.e. views based on the view called all, referenced with
api:labelledDescribeViewer). The code that requests the object
labels doesn't have a fallback if there are lotsandlots of selected
items.

If the fix Rob made was to change from labelledDescribeViewer to
a simpler viewer like api:describerViewer then that's a confirmation
and I'll add an issue to the issues list. 

Chris

-- 
A facility for quotation covers the absence of original thought./Gaudy Night/

Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)



Re: Re: Query causes a StackOverflowError

2014-03-18 Thread Rob Walpole
On Tue, Mar 18, 2014 at 11:59 AM, Chris Dollin chris.dol...@epimorphics.com wrote:

 On Monday, March 17, 2014 06:17:10 PM Adam Retter wrote:

  We did try using the:
 
  yourSPARQLEndpoint elda:supportsNestedSelects true.
 
  Although I think we found that the option was actually (non-plural):
 
  yourSPARQLEndpoint elda:supportsNestedSelect true.

 Oops, sorry.


Actually what we used in the end looked like:-

<http://localhost:3030/catalogue/query> elda:supportsNestedSelect "true"^^xsd:string .

Fuseki then said it was using nested selects (in the 500 error) - but it
didn't make any difference to the problem.



 If the fix Rob made was to change from labelledDescribeViewer to
 a simpler viewer like api:describerViewer then that's a confirmation
 and I'll add an issue to the issues list.


The thing is we do actually want the labels from all the viewed items as we
use these elsewhere. As the generated union query seemed to be the root of
the problem we switched to using our Elda construct extension so we could
have an endpoint like this:

spec:record-list a apivc:ListEndpoint
  ; apivc:uriTemplate "/record-list/{uuid}"
  ; apivc:variable [apivc:name "uuid"; apivc:type xsd:string]
  ; tna:construct """
      CONSTRUCT { ?member rdfs:label ?label . }
      WHERE {
        ?recordList dcterms:identifier ?uuid ;
                    dri:recordListMember ?member .
        ?member rdfs:label ?label .
      }
    """
  .

The query takes a while to run (there are approx 10,000
dri:recordListMember entries) but we get a result back eventually with no
stack overflow problem.

Rob
-- 

Rob Walpole
Email robkwalp...@gmail.com
Tel. +44 (0)7969 869881
Skype: RobertWalpole
http://www.linkedin.com/in/robwalpole


Re: Query causes a StackOverflowError

2014-03-17 Thread Chris_Dollin
On Sunday, March 16, 2014 06:58:29 PM Adam Retter wrote:

 Unfortunately, today, we have a query that is generated by Elda and
 POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is
 about 1.4MB!
 
 Unfortunately this query causes Fuseki to throw a
 java.lang.StackOverflowError. The only other post I found on the
 mailing list which looks similar was from 2011
 http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow
 up to it.

Concur with Andy -- you need to enable the nested select option 
(which has been in Elda for a long time, since we hit exactly the same
issue of ENORMOUS queries you have ...) 

Add to your configuration:

yourSPARQLEndpoint elda:supportsNestedSelects true.

where elda has been defined with

  @prefix elda:   <http://www.epimorphics.com/vocabularies/lda#> .

This is documented at

http://epimorphics.github.io/elda/docs/E1.2.29/reference.html#section-1015

(which is the .29 documentation, but this aspect hasn't changed since it was
introduced, and I see a couple of typos there which I will fix before .30 comes
out Later This Week All Being Well.)

Chris





Re: Query causes a StackOverflowError

2014-03-17 Thread Adam Retter
Thanks Guys,


We did try using the:

yourSPARQLEndpoint elda:supportsNestedSelects true.

Although I think we found that the option was actually (non-plural):

yourSPARQLEndpoint elda:supportsNestedSelect true.

However, whilst Elda reported that it did indeed support the
NestedSelect, it continued to create that absolutely massive union for
our query.

The good news is my colleague Rob was able to change Elda's config
file so that it generated a much simpler query for us. So we have now
moved past that one. If anyone wants more info on this, please say,
and I will ask him to post some detail.

Thanks for all your help guys.



-- 
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk


Re: Query causes a StackOverflowError

2014-03-17 Thread Andy Seaborne

On 17/03/14 18:17, Adam Retter wrote:

 Thanks Guys,

 We did try using the:

 yourSPARQLEndpoint elda:supportsNestedSelects true.

 Although I think we found that the option was actually (non-plural):

 yourSPARQLEndpoint elda:supportsNestedSelect true.

 However, whilst Elda reported that it did indeed support the
 NestedSelect, it continued to create that absolutely massive union for
 our query.

Bizarre - probably one for the ELDA mailing list
  linked-data-api-discuss AT googlegroups.com

 The good news is my colleague Rob was able to change Elda's config
 file so that it generated a much simpler query for us. So we have now
 moved past that one. If anyone wants more info on this, please say,
 and I will ask him to post some detail.

Yes - it would be interesting to know what changes you made, and their
effect.

Andy

 Thanks for all your help guys.




Re: Query causes a StackOverflowError

2014-03-16 Thread Andy Seaborne

Hi Adam,

On 16/03/14 18:58, Adam Retter wrote:

 Hi there,

 Firstly I would just like to say that whilst we have only been using
 Elda and Fuseki for about a year, we have until now been really very
 happy with them. Excellent stuff :-)


Which versions?

And is this using sub-SELECTs enabled in Elda?

IIRC that will replace the nearly 10,000 cases of UNION with the SELECT 
that generated it in the first place, which is much (much) shorter.
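
To illustrate the size difference described above, here is a rough Python sketch. It is not Elda's actual query generator: the query shapes, the function names `union_query` and `nested_select_query`, and the example.org URIs are all hypothetical, chosen only to show why one UNION branch per item grows with the item count while a sub-SELECT does not.

```python
PREFIXES = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"

def union_query(item_uris):
    """Hypothetical shape: one UNION branch per selected item."""
    branches = [
        "{ BIND(<%s> AS ?item) ?item rdfs:label ?label }" % uri
        for uri in item_uris
    ]
    body = "\n  UNION\n  ".join(branches)
    return PREFIXES + "SELECT ?item ?label WHERE {\n  " + body + "\n}"

def nested_select_query(selection_pattern):
    """Hypothetical shape: embed the selecting query as a sub-SELECT."""
    return (
        PREFIXES
        + "SELECT ?item ?label WHERE {\n"
        + "  { SELECT ?item WHERE { " + selection_pattern + " } }\n"
        + "  ?item rdfs:label ?label .\n"
        + "}"
    )

# 9,670 made-up items, so the first query needs 9,669 UNION keywords.
items = ["http://example.org/item/%d" % i for i in range(9_670)]
big = union_query(items)
small = nested_select_query("?item a <http://example.org/Record>")

print(big.count("UNION"), "UNION keywords,", len(big), "characters")
print(len(small), "characters with a sub-SELECT")
```

The second query stays the same length however many items the inner SELECT matches, which is the property the nested select option relies on.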




 Unfortunately, today, we have a query that is generated by Elda and
 POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is
 about 1.4MB!

 Unfortunately this query causes Fuseki to throw a
 java.lang.StackOverflowError. The only other post I found on the
 mailing list which looks similar was from 2011
 http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow
 up to it.


Have you tried increasing the stack?  What happens?


 Unfortunately, we really need to solve this issue quickly. I am not
 opposed to getting my hands dirty in the Jena code base if someone can
 tell me what needs to be done, and support me when I have questions.
 But hopefully there is some sort of quick workaround? So then, what
 are my options chaps?

 You may access the stack trace here:
 https://dl.dropboxusercontent.com/u/35135948/StackOverflowError.txt
 and the query that caused the exception here:
 https://dl.dropboxusercontent.com/u/35135948/fuseki-query.txt
 Sorry for the use of DropBox, but the Apache mailing list manager
 kept rejecting my post as it was too large otherwise.

 Thanks Adam.



Andy


Re: Query causes a StackOverflowError

2014-03-16 Thread Adam Retter
Thanks for the quick reply Andy -

On 16 March 2014 20:48, Andy Seaborne a...@apache.org wrote:
 Hi Adam,


 On 16/03/14 18:58, Adam Retter wrote:

 Hi there,

 Firstly I would just like to say that whilst we have only been using
 Elda and Fuseki for about a year, we have until now been really very
 happy with them. Excellent stuff :-)


 Which versions?

Fuseki 0.2.6, I did check the release notes and commit history for
newer versions, but did not see any detail of bug fixes that might
address this. Is it possible that it has been fixed and I just missed
it?

I will have to check the version of Elda tomorrow when I am in the
office and get back to you.

 And is this using sub-SELECTs enabled in Elda?

 IIRC that will replace the nearly 10,000 cases of UNION with the SELECT that
 generated it in the first place, which is much (much) shorter.

I will check tomorrow and get back to you. If we are not doing that,
then that sounds very promising.



 Unfortunately, today, we have a query that is generated by Elda and
 POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is
 about 1.4MB!

 Unfortunately this query causes Fuseki to throw a
 java.lang.StackOverflowError. The only other post I found on the
 mailing list which looks similar was from 2011
 http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow
 up to it.


 Have you tried increasing the stack?  What happens?

I can certainly give that a try tomorrow as well. However, I normally
try to avoid doing this, and prefer to fix the problem at its source,
otherwise I may just delay the inevitable to sometime in the future
when I run with a larger query ;-)


 Unfortunately, we really need to solve this issue quickly. I am not
 opposed to getting my hands dirty in the Jena code base if someone can
 tell me what needs to be done, and support me when I have questions.
 But hopefully there is some sort of quick workaround? So then, what
 are my options chaps?

 You may access the stack trace here:
 https://dl.dropboxusercontent.com/u/35135948/StackOverflowError.txt
 and the query that caused the exception here:
 https://dl.dropboxusercontent.com/u/35135948/fuseki-query.txt
 Sorry for the use of DropBox, but the Apache mailing list manager
 kept rejecting my post as it was too large otherwise.

 Thanks Adam.


 Andy



-- 
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk


Re: Query causes a StackOverflowError

2014-03-16 Thread Andy Seaborne

On 16/03/14 21:02, Adam Retter wrote:

 Thanks for the quick reply Andy -

 On 16 March 2014 20:48, Andy Seaborne a...@apache.org wrote:

  Hi Adam,

  On 16/03/14 18:58, Adam Retter wrote:

   Hi there,

   Firstly I would just like to say that whilst we have only been using
   Elda and Fuseki for about a year, we have until now been really very
   happy with them. Excellent stuff :-)

  Which versions?

 Fuseki 0.2.6, I did check the release notes and commit history for
 newer versions, but did not see any detail of bug fixes that might
 address this. Is it possible that it has been fixed and I just missed
 it?


It's nothing to do with Fuseki, which is just the protocol handling. 
The stacktrace is in algebra generation because


{P1} UNION {P2} UNION {P3}

is

(union
  {P1}
  (union
    {P2}
    (union
      {P3}
      ...

It hasn't even got to the optimizer.
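
The nesting above can be sketched in a few lines of Python. This is an illustration only, not Jena code: `Union`, `build_chain`, `depth_recursive` and `depth_iterative` are hypothetical names, the chain is nested to the right as drawn above, and Python's RecursionError stands in for Java's StackOverflowError. The point is that a recursive walk needs one call per nesting level, while a walk with an explicit stack does not.

```python
import sys

class Union:
    """Binary union node: (union left right)."""
    def __init__(self, left, right):
        self.left = left
        self.right = right

def build_chain(n_patterns):
    """Nest n_patterns leaves as (union P1 (union P2 (... Pn))), built in a loop."""
    tree = f"P{n_patterns}"
    for i in range(n_patterns - 1, 0, -1):
        tree = Union(f"P{i}", tree)
    return tree

def depth_recursive(op):
    """Recursive walk: one call per nesting level, like a visitor."""
    if not isinstance(op, Union):
        return 0
    return 1 + max(depth_recursive(op.left), depth_recursive(op.right))

def depth_iterative(op):
    """Same measurement with an explicit stack kept on the heap."""
    deepest = 0
    todo = [(op, 0)]
    while todo:
        node, depth = todo.pop()
        if isinstance(node, Union):
            todo.append((node.left, depth + 1))
            todo.append((node.right, depth + 1))
        else:
            deepest = max(deepest, depth)
    return deepest

tree = build_chain(9_670)       # 9,670 patterns joined by 9,669 unions
print(depth_iterative(tree))    # nesting depth equals the number of unions

try:
    depth_recursive(tree)
except RecursionError:
    print("recursive walk overflowed at the default limit of", sys.getrecursionlimit())
```

A larger stack only moves the point at which the recursive walk fails; the depth still grows linearly with the number of UNION branches.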

But then sending and parsing 1.4MB queries is never going to be fast.


 I will have to check the version of Elda tomorrow when I am in the
 office and get back to you.

  And is this using sub-SELECTs enabled in Elda?

  IIRC that will replace the nearly 10,000 cases of UNION with the SELECT that
  generated it in the first place, which is much (much) shorter.

 I will check tomorrow and get back to you. If we are not doing that,
 then that sounds very promising.

   Unfortunately, today, we have a query that is generated by Elda and
   POST'ed to Fuseki (https://github.com/epimorphics/elda). The query is
   about 1.4MB!

   Unfortunately this query causes Fuseki to throw a
   java.lang.StackOverflowError. The only other post I found on the
   mailing list which looks similar was from 2011
   http://markmail.org/message/pwzdrcn7lnkqra35 but there was no follow
   up to it.

  Have you tried increasing the stack?  What happens?

 I can certainly give that a try tomorrow as well. However, I normally
 try to avoid doing this, and prefer to fix the problem at its source,
 otherwise I may just delay the inevitable to sometime in the future
 when I run with a larger query ;-)


It is more information with which to debug.





   Unfortunately, we really need to solve this issue quickly. I am not
   opposed to getting my hands dirty in the Jena code base if someone can
   tell me what needs to be done, and support me when I have questions.
   But hopefully there is some sort of quick workaround? So then, what
   are my options chaps?

   You may access the stack trace here:
   https://dl.dropboxusercontent.com/u/35135948/StackOverflowError.txt
   and the query that caused the exception here:
   https://dl.dropboxusercontent.com/u/35135948/fuseki-query.txt
   Sorry for the use of DropBox, but the Apache mailing list manager
   kept rejecting my post as it was too large otherwise.

   Thanks Adam.



 Andy