[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645756#comment-15645756
 ] 

Taewoo Kim commented on ASTERIXDB-1556:
---------------------------------------

I'm now working on this idea: let D and H share the buffer pool (the same as 
[~che...@gmail.com]'s idea).

It turns out that allocating frames from the same buffer pool has one clear 
benefit: we no longer need to check whether H (hash table byte size) + D (data 
table byte size) > M (memory budget). But it introduces a different issue: 
atomicity. Currently, we don't have this issue because the data table and the 
hash table get their memory frames from different sources. That is, their 
allocations never fail. With a shared pool, however, inserting a tuple from an 
incoming frame takes two steps: 1) insert the tuple into the data table and 
get the tuple pointer (frame index, offset), then 2) insert this tuple pointer 
into the hash table. By the time we reach 2), there might not be enough memory 
left, so 1) can succeed while 2) fails. In that case, we need to revert the 
effect of 1). Also, 2) itself consists of multiple steps when a hash slot 
needs to be migrated due to slot overflow, so any partial steps already done 
in 2) need to be reversed, too. 
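
To make the atomicity concern concrete, here is a minimal sketch in Java of 
one possible way to handle it: an undo log that records each frame taken from 
the shared pool and replays the log in reverse when a later step fails. The 
names FramePool and AtomicInserter, and the idea of passing the number of 
frames each step needs, are assumptions made for illustration only. They are 
not the actual AsterixDB/Hyracks classes or the fix that was implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical frame pool shared by the data table (D) and hash table (H). */
final class FramePool {
    private int freeFrames;

    FramePool(int totalFrames) {
        this.freeFrames = totalFrames;
    }

    /** Takes one frame from the pool; returns false if the budget is exhausted. */
    boolean tryAllocate() {
        if (freeFrames == 0) {
            return false;
        }
        freeFrames--;
        return true;
    }

    void release() {
        freeFrames++;
    }

    int free() {
        return freeFrames;
    }
}

/** Hypothetical inserter: an insert either fully succeeds or leaves no trace. */
final class AtomicInserter {
    private final FramePool pool;
    private int dataFrames; // frames currently held by the data table
    private int hashFrames; // frames currently held by the hash table

    AtomicInserter(FramePool pool) {
        this.pool = pool;
    }

    /**
     * Step 1 grows the data table, step 2 grows the hash table. Step 2 may
     * need several frames (for example, when a slot is migrated on overflow).
     * Every completed sub-step pushes an undo action; on failure the actions
     * run in reverse order so that step 1 and any partial step 2 are reverted.
     */
    boolean insert(int dataFramesNeeded, int hashFramesNeeded) {
        Deque<Runnable> undo = new ArrayDeque<>();
        for (int i = 0; i < dataFramesNeeded; i++) {
            if (!pool.tryAllocate()) {
                rollback(undo);
                return false;
            }
            dataFrames++;
            undo.push(() -> { dataFrames--; pool.release(); });
        }
        for (int i = 0; i < hashFramesNeeded; i++) {
            if (!pool.tryAllocate()) {
                rollback(undo);
                return false;
            }
            hashFrames++;
            undo.push(() -> { hashFrames--; pool.release(); });
        }
        return true;
    }

    private void rollback(Deque<Runnable> undo) {
        while (!undo.isEmpty()) {
            undo.pop().run();
        }
    }

    int dataFrames() {
        return dataFrames;
    }

    int hashFrames() {
        return hashFrames;
    }
}
```

An alternative design would be to compute the frames needed by both steps up 
front and reserve them before touching either table, which avoids the undo 
log at the cost of knowing the worst-case need of 2) in advance.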

> Hash Table used by External hash group-by doesn't conform to the budget.
> ------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1556
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1556
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>            Priority: Critical
>              Labels: soon
>         Attachments: 2wayjoin.pdf, 2wayjoin.rtf, 2wayjoinplan.rtf, 
> 3wayjoin.pdf, 3wayjoin.rtf, 3wayjoinplan.rtf
>
>
> When we enable prefix-based fuzzy-join and apply a multi-way fuzzy-join 
> (more than 2-way), the system generates an out-of-memory exception. 
> Since a fuzzy-join is written as 30-40 lines of AQL code and this AQL is 
> translated into a massive number of operators (more than 200 operators in 
> the plan for a 3-way fuzzy join), it could generate an out-of-memory 
> exception.
> /// Update: as the discussion went on, we found that the hash table in the 
> external hash group-by doesn't conform to the frame limit. So, an 
> out-of-memory exception happens during the execution of an external hash 
> group-by operator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
