[ 
https://issues.apache.org/jira/browse/HIVE-13730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283217#comment-15283217
 ] 

Wei Zheng commented on HIVE-13730:
----------------------------------

After digging deeper, it turns out that the issue is actually caused by 
spilling the same hash partition twice.

In HybridHashTableContainer.internalPutRow, once isMemoryFull() returns true, 
we will pick the biggest partition in memory so far by calling 
biggestPartition(). This method is problematic.
{code}
  private int biggestPartition() {
    int res = 0;
    int maxSize = 0;

    // If a partition has been spilled to disk, its size will be 0, i.e. it
    // won't be picked
    for (int i = 0; i < hashPartitions.length; i++) {
      int size;
      if (isOnDisk(i)) {
        continue;
      } else {
        size = hashPartitions[i].hashMap.getNumValues();
      }
      if (size > maxSize) {
        maxSize = size;
        res = i;
      }
    }
    return res;
  }
{code}

If all in-memory partitions have size 0, the check size > maxSize never 
succeeds, so res keeps its initial value and 0 is returned. But what if 
partition 0 has already been spilled previously? Then partition 0 gets spilled 
a second time, which is not expected.
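
To make the failure mode concrete, here is a minimal standalone sketch. It is 
not the attached patch and not the real HybridHashTableContainer code. The 
class BiggestPartitionSketch and the sizes / onDisk arrays are hypothetical 
stand-ins for hashPartitions[i].hashMap.getNumValues() and isOnDisk(i). The 
first method mirrors the logic quoted above; the second shows one possible 
guard, which starts from "no candidate" (-1) instead of index 0.

```java
// Hypothetical, simplified model of the partition selection logic.
// sizes[i]  stands in for hashPartitions[i].hashMap.getNumValues()
// onDisk[i] stands in for isOnDisk(i)
final class BiggestPartitionSketch {

    // Mirrors the quoted method: falls back to index 0 when no in-memory
    // partition has a size greater than 0, even if partition 0 is on disk.
    static int biggestPartitionOriginal(int[] sizes, boolean[] onDisk) {
        int res = 0;
        int maxSize = 0;
        for (int i = 0; i < sizes.length; i++) {
            if (onDisk[i]) {
                continue;
            }
            if (sizes[i] > maxSize) {
                maxSize = sizes[i];
                res = i;
            }
        }
        return res;
    }

    // One possible guard (an assumption, not the committed fix): start with
    // no candidate so that only in-memory partitions can ever be returned.
    // Returns -1 when every partition is already on disk.
    static int biggestPartitionGuarded(int[] sizes, boolean[] onDisk) {
        int res = -1;
        int maxSize = -1;
        for (int i = 0; i < sizes.length; i++) {
            if (onDisk[i]) {
                continue;
            }
            if (sizes[i] > maxSize) {
                maxSize = sizes[i];
                res = i;
            }
        }
        return res;
    }

    public static void main(String[] args) {
        // Partition 0 was spilled earlier; partitions 1 and 2 are empty.
        int[] sizes = {0, 0, 0};
        boolean[] onDisk = {true, false, false};
        System.out.println("original: " + biggestPartitionOriginal(sizes, onDisk));
        System.out.println("guarded:  " + biggestPartitionGuarded(sizes, onDisk));
    }
}
```

With partition 0 on disk and the others empty, the original logic returns 0 
again, while the guarded variant returns 1. A caller of the guarded variant 
would have to handle -1 (nothing left in memory to spill) explicitly.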

> hybridgrace_hashjoin_1.q test gets stuck
> ----------------------------------------
>
>                 Key: HIVE-13730
>                 URL: https://issues.apache.org/jira/browse/HIVE-13730
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 2.1.0
>            Reporter: Vikram Dixit K
>            Assignee: Wei Zheng
>            Priority: Blocker
>         Attachments: HIVE-13730.1.patch, HIVE-13730.2.patch, 
> HIVE-13730.3.patch
>
>
> I am seeing hybridgrace_hashjoin_1.q getting stuck on master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
