[jira] Commented: (DERBY-634) Subquery materialization can cause stack overflow

Satheesh Bandaram (JIRA) Wed, 16 Aug 2006 09:58:39 -0700

    [ 
http://issues.apache.org/jira/browse/DERBY-634?page=comments#action_12428436 ] 
            
Satheesh Bandaram commented on DERBY-634:
-----------------------------------------


Kathey, here are some answers...

1) No... There was a specific bug in the original optimization that could cause 
stack overflow. That bug has been correctly fixed. There should be no cases of 
stack overflow now.
2) This is a comprehensive fix to the stack overflow problem. I was only saying 
generalizing the optimization that was implemented long time ago to be based on 
new subquery materialization framework would improve performance for more 
classes of queries. This would be an improvement request, like we already have 
several optimizer/preprocess improvements. Army and myself compiled about 10 
known optimizations that could be implemented in Derby. You can say this would 
be another.. But this has nothing to do with stack overflow problem itself.

If this is not clear, let me know. I can provide more info.


> Subquery materialization can cause stack overflow
> -------------------------------------------------
>
>                 Key: DERBY-634
>                 URL: http://issues.apache.org/jira/browse/DERBY-634
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.1.1.1
>            Reporter: Jeff Lichtman
>             Fix For: 10.2.0.0
>
>         Attachments: DERBY-634.diff
>
>
> A performance optimization in subquery processing can cause a stack overflow.
> The optimization materializes a subquery ResultSet in memory where it thinks 
> the rows will fit in memory. The materialization is done as a  set of  nested 
> unions of constant rows (UnionResultSets and RowResultSets). If there are a 
> lot of rows this can cause a stack overflow when fetching a row.
> The obvious fix is to make it use an iterative technique rather than a 
> recursive one for storing and returning the rows. See the method 
> BaseActivation.materializeResultSetIfPossible() in the language execution 
> code.
> There are some other issues with this performance optimization that should be 
> looked at:
> 1) The optimization can backfire, making the query run much slower. For 
> example, in the query:
>     select * from one_row_table where column1 not in
>         (select column2 from million_row_table)
> reading million_row_table into memory is an expensive operation. If there is 
> an index on million_row_table.column2, the query should return a result very 
> quickly despite the large size of million_row_table by doing a single probe 
> into million_row_table via the index.
> Since in-memory materialization can be an expensive operation, the decision 
> about whether to do it should be made based on query optimizer cost 
> estimates. See SubqueryNode.generateExpression().
> 2) It may not be wise to cache partial query results in memory at all. 
> Although this can help performance in some cases, it also chews up memory. 
> This is different from a limited-size cache with a backing store (like what 
> the store uses for page caching). The language has no way to limit the total 
> amount of memory used in this type of processing. Note that hash joins 
> originally used in-memory hash tables with no backing store, and that a 
> backing store was added later.
> 3) The implementation of this optimization has some problems. The decision to 
> materialize the subquery results in memory is made during code generation - 
> all such decisions should be made during the compilation phase. It's not 
> clear to me why materializeResultSetIfPossible() is in BaseActivation - I 
> would expect the of materialization to be done by a type of ResultSet, not by 
> a method in BaseActivation. Also, this method calls getMaxMemoryPerTable() in 
> the OptimizerFactory - nothing in the execution code should refer to anything 
> in the compilation code (perhaps getMaxMemoryPerTable() should be moved 
> somewhere else).

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-634) Subquery materialization can cause stack overflow

Reply via email to