[ https://issues.apache.org/jira/browse/ASTERIXDB-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16332785#comment-16332785 ]
Ildar Absalyamov commented on ASTERIXDB-2253:
---------------------------------------------

[~wyk], this plan is the result of the rule DisjunctivePredicateToJoinRule being fired. But I agree that in your example keeping just two selects would be the better alternative. I guess it comes down to the broader issue of nested loop vs. hash join, which depends on the outer/inner cardinalities. An intermediate solution could be to run experiments to determine a threshold (in number of disjuncts) after which this rule is fired.

> Disjunctive predicts on the same fields introduces join
> -------------------------------------------------------
>
>          Key: ASTERIXDB-2253
>          URL: https://issues.apache.org/jira/browse/ASTERIXDB-2253
>      Project: Apache AsterixDB
>   Issue Type: Bug
>   Components: COMP - Compiler
>     Reporter: Wail Alkowaileet
>     Priority: Major
>
> I'm not sure if I'm missing something ... It looks more expensive than StreamSelect
> Query:
> {noformat}
> SELECT value t.text
> FROM Tweets as t
> WHERE t.retweet_count = 10 or t.retweet_count = 20{noformat}
> Plan:
> {noformat}
> distribute result [$$16]
> -- DISTRIBUTE_RESULT |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>     project ([$$16])
>     -- STREAM_PROJECT |PARTITIONED|
>       exchange
>       -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>         join (eq($$19, $$17))
>         -- HYBRID_HASH_JOIN [$$17][$$19] |PARTITIONED|
>           exchange
>           -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>             project ([$$16, $$17])
>             -- STREAM_PROJECT |PARTITIONED|
>               assign [$$16, $$17] <- [$$t.getField("text"), $$t.getField("retweet_count")]
>               -- ASSIGN |PARTITIONED|
>                 project ([$$t])
>                 -- STREAM_PROJECT |PARTITIONED|
>                   exchange
>                   -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>                     data-scan []<-[$$18, $$t] <- TwitterDataverse.Tweets
>                     -- DATASOURCE_SCAN |PARTITIONED|
>                       exchange
>                       -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>                         empty-tuple-source
>                         -- EMPTY_TUPLE_SOURCE |PARTITIONED|
>           exchange
>           -- BROADCAST_EXCHANGE |PARTITIONED|
>             unnest $$19 <- scan-collection(array: [ 20, 10 ])
>             -- UNNEST |UNPARTITIONED|
>               empty-tuple-source
>               -- EMPTY_TUPLE_SOURCE |UNPARTITIONED|
> {noformat}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
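
The threshold idea from the comment could be sketched roughly as below. This is a standalone illustration, not AsterixDB code: the class name, the method `shouldRewriteToJoin`, and the cutoff value of 3 are all hypothetical, and treating the threshold as inclusive is an assumption. The actual value would have to come from the experiments the comment proposes.

```java
import java.util.List;

// Hypothetical sketch of a disjunct-count threshold; not part of AsterixDB.
final class DisjunctThresholdSketch {

    // Assumed placeholder cutoff; the real value would be found by experiment.
    static final int DEFAULT_THRESHOLD = 3;

    /**
     * Decides whether a disjunction of equality predicates should be
     * rewritten into a join against a constant collection.
     *
     * @param disjunctFields the field compared by each "field = constant" disjunct
     * @param threshold      minimum number of disjuncts for the rewrite (assumed inclusive)
     */
    static boolean shouldRewriteToJoin(List<String> disjunctFields, int threshold) {
        if (disjunctFields.isEmpty()) {
            return false;
        }
        // The rewrite only makes sense when every disjunct tests the same field.
        String first = disjunctFields.get(0);
        for (String field : disjunctFields) {
            if (!field.equals(first)) {
                return false;
            }
        }
        // Below the threshold, keep the plain select with the OR-ed conditions.
        return disjunctFields.size() >= threshold;
    }

    public static void main(String[] args) {
        // The query from the issue has two disjuncts on retweet_count.
        List<String> fromIssue = List.of("retweet_count", "retweet_count");
        System.out.println(shouldRewriteToJoin(fromIssue, DEFAULT_THRESHOLD)); // prints "false"
    }
}
```

With the placeholder cutoff of 3, the two-disjunct query from the issue would stay a plain select, while a longer list of disjuncts on the same field would still be turned into a join.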