[ https://issues.apache.org/jira/browse/ASTERIXDB-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406254#comment-15406254 ]
Taewoo Kim commented on ASTERIXDB-1556:
---------------------------------------

(2) Spilling the data frames is already implemented. What we lack is any mechanism that keeps the hash table's size allocation, and its overflow, within the budget. We need to add two pieces of logic:

#1. Allocate each hash table frame from the global budget.
#2. Spill whenever the data or the hash table can't allocate another frame.

Regarding the time, I first need to understand the current structure of the hash table and the data spilling logic, so it may take a while. But without applying this protection mechanism, we can't say with 100% certainty that reducing the table size of the external group-by solves the out-of-memory issue in the fuzzy join. So, I will do my best to reduce the time.

> Prefix-based multi-way Fuzzy-join generates an exception.
> ---------------------------------------------------------
>
>                 Key: ASTERIXDB-1556
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1556
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>         Attachments: 2wayjoin.pdf, 2wayjoin.rtf, 2wayjoinplan.rtf, 3wayjoin.pdf, 3wayjoin.rtf, 3wayjoinplan.rtf
>
>
> When we enable prefix-based fuzzy-join and apply the multi-way fuzzy-join (> 2), the system generates an out-of-memory exception.
> Since a fuzzy-join is created using 30-40 lines of AQL code and this AQL is translated into a massive number of operators (more than 200 operators in the plan for a 3-way fuzzy join), it could generate an out-of-memory exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
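
For illustration only, the two pieces of logic proposed in the comment above (#1 drawing hash table frames from the shared budget, #2 spilling when either the data or the hash table can't get another frame) could be sketched roughly as follows. This is a toy Python model, not AsterixDB or Hyracks code: `FrameBudget`, `BudgetedGroupBy`, and the parameters `records_per_data_frame` and `slots_per_table_frame` are all hypothetical names, and a "spill" here just moves the in-memory groups to a list instead of writing a run file.

```python
import math


class FrameBudget:
    """Hypothetical shared budget, counted in frames."""

    def __init__(self, total_frames):
        self.total = total_frames
        self.used = 0

    def try_acquire(self):
        # Refuse the request instead of exceeding the budget.
        if self.used >= self.total:
            return False
        self.used += 1
        return True

    def release(self, count):
        self.used -= count


class BudgetedGroupBy:
    """Toy group-by (count per key) whose data frames AND hash table
    frames are both charged to the same FrameBudget."""

    def __init__(self, budget, records_per_data_frame=2, slots_per_table_frame=4):
        self.budget = budget
        self.records_per_data_frame = records_per_data_frame  # assumed capacity
        self.slots_per_table_frame = slots_per_table_frame    # assumed capacity
        self.data_frames = 0
        self.table_frames = 0
        self.groups = {}        # in-memory aggregates: key -> count
        self.spilled_runs = []  # stand-in for run files on disk

    def _ensure_capacity(self, group_count):
        # Logic #1: hash table frames come from the same budget as data frames.
        need_data = math.ceil(group_count / self.records_per_data_frame)
        need_table = math.ceil(group_count / self.slots_per_table_frame)
        while self.data_frames < need_data:
            if not self.budget.try_acquire():
                return False
            self.data_frames += 1
        while self.table_frames < need_table:
            if not self.budget.try_acquire():
                return False
            self.table_frames += 1
        return True

    def _spill(self):
        # Move the in-memory groups out and hand every held frame back.
        if self.groups:
            self.spilled_runs.append(self.groups)
            self.groups = {}
        self.budget.release(self.data_frames + self.table_frames)
        self.data_frames = 0
        self.table_frames = 0

    def insert(self, key):
        if key in self.groups:
            self.groups[key] += 1
            return
        # Logic #2: spill when EITHER the data or the hash table
        # cannot allocate another frame.
        if not self._ensure_capacity(len(self.groups) + 1):
            self._spill()
            if not self._ensure_capacity(1):
                raise MemoryError("budget too small to hold a single group")
        self.groups[key] = 1
```

In this sketch a failed hash table allocation triggers the same spill path as a failed data frame allocation, so the operator's total footprint never exceeds `FrameBudget.total`. A real implementation would also have to merge the spilled runs afterwards, which is omitted here.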