[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406254#comment-15406254
 ] 

Taewoo Kim commented on ASTERIXDB-1556:
---------------------------------------

(2) Spilling of data frames is already implemented. What is missing is a 
mechanism that keeps hash table allocation and growth within the memory 
budget. We need to add two pieces of logic: #1. allocating each hash table 
frame from the global budget; #2. spilling whenever either the data or the 
hash table can't allocate another frame. 
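
To make the two points concrete, here is a minimal sketch in Java. It is not 
the AsterixDB/Hyracks code: FrameBudget, BudgetedGroupTable and all of their 
methods are hypothetical names made up for illustration, and "spilling" is 
reduced to releasing the frames instead of writing them to disk. It only shows 
data frames and hash table frames drawing from one shared budget, with a spill 
triggered when either of them can't get another frame.

```java
/** Hypothetical shared budget, counted in frames (not an AsterixDB class). */
final class FrameBudget {
    private final int maxFrames;
    private int usedFrames;

    FrameBudget(int maxFrames) {
        this.maxFrames = maxFrames;
    }

    /** Reserves one frame; returns false when the budget is exhausted. */
    boolean tryAcquire() {
        if (usedFrames >= maxFrames) {
            return false;
        }
        usedFrames++;
        return true;
    }

    /** Returns previously reserved frames to the budget. */
    void release(int frames) {
        usedFrames = Math.max(0, usedFrames - frames);
    }

    int used() {
        return usedFrames;
    }
}

/** Hypothetical group-by table whose data AND hash table frames share one budget. */
final class BudgetedGroupTable {
    private final FrameBudget budget;
    private int dataFrames;
    private int tableFrames;
    private int spillCount;

    BudgetedGroupTable(FrameBudget budget) {
        this.budget = budget;
    }

    /** #1: a hash table frame is taken from the global budget, like a data frame. */
    void addTableFrame() {
        acquireOrSpill();
        tableFrames++;
    }

    void addDataFrame() {
        acquireOrSpill();
        dataFrames++;
    }

    /** #2: spill whenever the data or the hash table can't allocate another frame. */
    private void acquireOrSpill() {
        if (budget.tryAcquire()) {
            return;
        }
        spill();
        if (!budget.tryAcquire()) {
            throw new IllegalStateException("budget cannot hold even one frame");
        }
    }

    /** Stand-in for flushing to disk: frees every frame held by this table. */
    private void spill() {
        budget.release(dataFrames + tableFrames);
        dataFrames = 0;
        tableFrames = 0;
        spillCount++;
    }

    int dataFrames() { return dataFrames; }
    int tableFrames() { return tableFrames; }
    int spillCount() { return spillCount; }
}
```

With a budget of 3 frames, one table frame plus two data frames fill the 
budget; the next request of either kind forces a spill and then succeeds, so 
the total never exceeds the budget. A real implementation would likely spill 
selected partitions rather than everything, which this sketch does not model.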

Regarding the time estimate, I first need to understand the current structure 
of the hash table and the data spilling logic, so it may take a while. 
However, without this protection mechanism in place, we can't say that 
reducing the table size of the external group-by fully solves the 
out-of-memory issue in the fuzzy join. So I will do my best to keep the time 
short. 

> Prefix-based multi-way Fuzzy-join generates an exception.
> ---------------------------------------------------------
>
>                 Key: ASTERIXDB-1556
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1556
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>         Attachments: 2wayjoin.pdf, 2wayjoin.rtf, 2wayjoinplan.rtf, 
> 3wayjoin.pdf, 3wayjoin.rtf, 3wayjoinplan.rtf
>
>
> When we enable prefix-based fuzzy-join and apply a multi-way fuzzy-join 
> (more than 2-way), the system generates an out-of-memory exception. 
> Since a fuzzy-join is written as 30-40 lines of AQL code and this AQL is 
> translated into a massive number of operators (more than 200 operators in 
> the plan for a 3-way fuzzy join), it could generate an out-of-memory 
> exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
