Recent SNAPSHOT builds of ARQ include a pair of new optimizers 
(TransformFilterImplicitJoin and TransformImplicitLeftJoin) which aim to 
address performance deficiencies in a certain style of query.  In our testing 
we have seen performance improvements of 1-2 orders of magnitude for these 
kinds of queries.

These optimizations target queries of the following general forms:

1 – Implicit Join – Queries where a FILTER applies a ?x = ?y or SAMETERM(?x, 
?y) constraint, e.g.

SELECT *
WHERE
{
  ?x ?p1 ?o1 .
  ?y ?p2 ?o2 .
  FILTER(?x = ?y)
}
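
As a rough illustration of the idea (a hand-written sketch, not the optimizer's 
actual output): in the query above ?x and ?y appear only in the subject 
position, so they can never be bound to literals, and for IRIs and blank nodes 
?x = ?y holds exactly when both are the same term.  Under that assumption the 
query is equivalent to one that uses a single variable in both patterns and 
assigns the other from it, which lets the engine evaluate an ordinary join 
instead of filtering a cross product:

```sparql
# Hand-written equivalent of the implicit join query above.
# Illustration only; the algebra ARQ actually generates may differ.
SELECT *
WHERE
{
  ?x ?p1 ?o1 .
  ?x ?p2 ?o2 .
  BIND(?x AS ?y)   # keep ?y in the results, bound to the same term as ?x
}
```

If either variable could be bound to a literal (for example, if it appeared in 
the object position), = would compare values rather than terms, so two 
different terms such as "1"^^xsd:integer and "01"^^xsd:integer would be equal 
and this substitution would no longer be valid.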

2 – Implicit Left Join – Queries where a FILTER applies a ?x = ?y or 
SAMETERM(?x, ?y) constraint over a left join, e.g.

SELECT *
WHERE
{
  ?x ?p1 ?o1 .
  OPTIONAL
  {
    ?y ?p2 ?o2 .
  }
  FILTER(?x = ?y)
}

In both cases the optimizers are conservative: the optimization is applied 
only when it is considered safe.

While we have already added many test cases to this effect, we would 
appreciate it if users whose workloads include this style of query could run 
the latest SNAPSHOT against their queries, to check that we are not applying 
the optimization in cases where it is unsafe and have not introduced 
performance regressions (e.g. due to the new optimizations blocking other 
optimizations).  Reports of any queries that exhibit these issues would be 
appreciated; reports of improved performance would also provide useful 
validation of this work.

The work is ongoing and there are further cases that we could optimize but do 
not yet, so expect further improvements in this area.

Thanks,

Rob
