[ 
https://issues.apache.org/jira/browse/JENA-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088181#comment-15088181
 ] 

Andy Seaborne edited comment on JENA-1113 at 1/8/16 1:38 PM:
-------------------------------------------------------------

The value of expanding a large expression for the rest of the optimizer to be 
able to pick it apart is going to be pretty minimal for very long {{IN}} lists. 
 

The optimizer does optimize {{IN}} via its expansion: e.g.
{noformat}
{?s ?p ?o . FILTER ( ?o IN (:C1, :C2 ) ) }
{noformat}
{noformat}
{?s ?p ?o . FILTER ( ?o = :C1 || ?o = :C2 )
{noformat}
{noformat}
{?s ?p :C1 . BIND(:C1 as ?o) }
  UNION
{?s ?p :C2 . BIND(:C2 as ?o) }
{noformat}

The expansion should probably have an upper limit applied.


was (Author: andy.seaborne):
The value of expanding a large expression for the rest of the optimizer to be 
able to pick it apart is going to be pretty minimal for very long {{IN}} lists. 
 

The optimize does optimize e.g.
{noformat}
{?s ?p ?o . FILTER ( ?o = :C1 || ?o = :C2 )
{noformat}
into
{noformat}
      {?s ?p :C1 BIND(:C1 as ?o) }
UNION
      {?s ?p :C2 BIND(:C2 as ?o) }
{noformat}

The expansion should probably have an upper limit applied.

> FILTER(?x IN ( .... )) causes stackoverflow for very long lists of values.
> --------------------------------------------------------------------------
>
>                 Key: JENA-1113
>                 URL: https://issues.apache.org/jira/browse/JENA-1113
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ
>    Affects Versions: Jena 3.0.1
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>            Priority: Minor
>
> {{FILTER(?x IN ( .... ))}} with a list of 60,000+ URIs causes a stack 
> overflow during execution. The list was machine generated.
> {{?x IN(C1, C2, ...)}} is expanded to {{?x = C1 || ?x = C2 ||  .... }}. In 
> many places, working with expressions is a tree walk ({{ExprWalker}}).
> The transformation is a left-deep tree: for a 4-long list:
> {noformat}
> (||
>  (||
>   (|| (= ?x C1)
>       (= ?x C2))
>   (= ?x C3))
>  (= ?x C4))
> {noformat}
> so the walk is as deep as the list is long.  A transformation rewrite is 3 
> stack frames per level.  The result is a huge number of stack frames until is 
> exceeds the stack space available.
> A right-deep list would be better but only if all processing code is aware of 
> this and does a loop for the right recursion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to