GitHub user jianqiao opened a pull request:

    https://github.com/apache/incubator-quickstep/pull/333

    Fix SeparateChainingHashTable::resize()

    This PR fixes a hang that occurs when Quickstep resizes a
    `SeparateChainingHashTable` during the execution of `BuildHashOperator`.
    
    The following sequence of queries reproduces the problem:
    ```
    CREATE TABLE r(x INT, y INT);
    CREATE TABLE s(x INT, y INT);
    CREATE TABLE t(x INT, y INT);
    
    INSERT INTO r SELECT 1, 1 FROM generate_series(1, 200) AS g(x);
    INSERT INTO s SELECT 1, 1 FROM generate_series(1, 200) AS g(x);
    INSERT INTO t SELECT 1, 1 FROM generate_series(1, 1000) AS g(x);
    
    \analyze
    
    SELECT COUNT(*) FROM r, s, t
    WHERE r.x = s.x AND r.y = s.y AND s.x = t.x AND s.y = t.y;
    ```
    
    The problem is caused by the
    [`resize()` call](https://github.com/apache/incubator-quickstep/blob/master/storage/HashTable.hpp#L1514)
    in `HashTable::putValueAccessorCompositeKey()` when `using_prealloc` is
    true. In this case, pre-allocation decides to resize the hash table so that
    it can consume all the tuples from the current value accessor. However,
    `resize()` always aborts if the hash table is not "actually full", so the
    table never grows and the caller loops forever.
    
    Note that `SimpleScalarSeparateChainingHashTable` does not have the same
    problem, as its
    [`isFull` method](https://github.com/apache/incubator-quickstep/blob/master/storage/SimpleScalarSeparateChainingHashTable.hpp#L241)
    already takes `extra_buckets` into consideration.
    
    Also note that `LinearOpenAddressingHashTable` seems to avoid the hang by
    using a
    [`retry_num` check](https://github.com/apache/incubator-quickstep/blob/master/storage/LinearOpenAddressingHashTable.hpp#L1203).
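
    To make the failure mode described above concrete, here is a minimal toy
    model. It is not Quickstep code and not the actual patch: `ToyTable`,
    `resizeStrict()`, `resizeFor()` and `preallocate()` are hypothetical names,
    and `max_attempts` is an artificial cap that stands in for the infinite
    loop. It only illustrates why a `resize()` that bails out unless the table
    is "actually full" can never satisfy a pre-allocation request, and how a
    fullness check that takes `extra_buckets` into account avoids that.

    ```cpp
    #include <cassert>
    #include <cstddef>

    // Toy model only: names and logic are illustrative, not Quickstep's code.
    struct ToyTable {
      std::size_t capacity;  // total buckets available
      std::size_t used;      // buckets already allocated

      // Strict check: full only when no free bucket is left at all.
      bool isFull() const { return used >= capacity; }

      // Check that also counts the buckets the caller is about to need.
      bool isFull(std::size_t extra_buckets) const {
        return used + extra_buckets > capacity;
      }

      // Doubles the capacity (or sets it to 1 if it was 0).
      void grow() { capacity = (capacity == 0) ? 1 : capacity * 2; }

      // Mimics the reported behavior: abort unless "actually full".
      bool resizeStrict() {
        if (!isFull()) return false;  // aborted, capacity unchanged
        grow();
        return true;
      }

      // Variant that honors the caller's pending demand.
      bool resizeFor(std::size_t extra_buckets) {
        if (!isFull(extra_buckets)) return false;
        while (isFull(extra_buckets)) grow();
        return true;
      }
    };

    // Pre-allocation loop: keep resizing until `needed` more buckets fit.
    // Returns the number of resize attempts made; a result equal to
    // max_attempts means the request never fit (the "hang").
    inline int preallocate(ToyTable &t, std::size_t needed, bool strict,
                           int max_attempts) {
      assert(max_attempts >= 0);
      int attempts = 0;
      while (t.used + needed > t.capacity && attempts < max_attempts) {
        if (strict) {
          t.resizeStrict();
        } else {
          t.resizeFor(needed);
        }
        ++attempts;
      }
      return attempts;
    }
    ```

    With a table of 8 buckets of which 6 are used and a request for 4 more,
    the strict variant exhausts every attempt without growing, while the
    `extra_buckets`-aware variant grows once and the loop exits.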

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jianqiao/incubator-quickstep fix-hash-resize

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/333.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #333
    
----
commit d1dbb0d9bc2d1f001deee4039157b0be464870f4
Author: Jianqiao Zhu <jianqiao@...>
Date:   2018-02-18T07:16:07Z

    Fix the hanging problem of SeparateChainingHashTable::resize()

----

