[ https://issues.apache.org/jira/browse/ASTERIXDB-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398207#comment-15398207 ]
Taewoo Kim commented on ASTERIXDB-1556:
---------------------------------------

I agree that the join cost is more important. However, the complexity of the query plan should be affordable. It's clear that the current multi-way fuzzy join is not doable in real time in AsterixDB. We need to find a way to resolve this, and we need to think about the join cost, too. So, as you said, how can we make the prefix-join runnable? I think a multi-way join should be optimized as I mentioned. If we have three datasets: fuzzy-join A and B first -> materialize -> then apply the fuzzy-join to (AB) and C.

> Prefix-based multi-way Fuzzy-join generates an exception.
> ---------------------------------------------------------
>
>                 Key: ASTERIXDB-1556
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1556
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>
> When we enable prefix-based fuzzy-join and apply the multi-way fuzzy-join
> (> 2), the system generates an out-of-memory exception.
> Since a fuzzy-join is created using 30-40 lines of AQL code and this AQL is
> translated into a massive number of operators (more than 200 operators in the
> plan for a 3-way fuzzy join), it could generate an out-of-memory exception.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
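
A minimal sketch, not AsterixDB code, of the pairwise evaluation suggested in the comment above: fuzzy-join A and B, materialize the result, then fuzzy-join (AB) with C. Everything here is an assumption made for illustration: the function names, word-token Jaccard similarity as the fuzzy predicate, a plain nested-loop join standing in for the prefix-based join, and the choice to compare each C record against the B side of a materialized pair.

```python
def tokens(text):
    """Split a string into a set of lowercase word tokens."""
    return frozenset(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def fuzzy_join(left, right, threshold):
    """Nested-loop similarity join (hypothetical stand-in for the prefix join).

    `left` is a list of tuples of records; each tuple is compared to a
    `right` record through its last element. The result is returned as a
    fully built list, i.e. it is materialized before anything consumes it.
    """
    return [
        row + (rec,)
        for row in left
        for rec in right
        if jaccard(tokens(row[-1]), tokens(rec)) >= threshold
    ]


def three_way_fuzzy_join(a, b, c, threshold):
    """Evaluate the 3-way join as two binary joins instead of one large plan."""
    ab = fuzzy_join([(rec,) for rec in a], b, threshold)  # step 1: A join B, materialized
    return fuzzy_join(ab, c, threshold)                   # step 2: (AB) join C


if __name__ == "__main__":
    a = ["apache asterix db", "big data"]
    b = ["apache asterix database", "data big"]
    c = ["asterix db apache", "big data systems"]
    for row in three_way_fuzzy_join(a, b, c, 0.5):
        print(row)
```

The point of the sketch is only the shape of the evaluation: each binary join is a small, separate step over a materialized input, so no single step has to hold the operators of the whole multi-way join at once.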