[ https://issues.apache.org/jira/browse/SPARK-31148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tanapol Nearunchorn updated SPARK-31148: ---------------------------------------- Description: Given this query {code:java} select r.id from sqoop_wongnai.wongnai__w_ref r inner join sqoop_wongnai.wongnai__w_photo p on p.root_referrer_id = r.id inner join tests.restaurants_id_export_photos i on i.restaurant_id = r.restaurant_id where r.restaurant_id = 360 {code} Spark return nothing except when I switch inner join statement between line 3 and 4 there are results returned. I guarantee that we have data that match the join condition above. Here's result from explain extend of above query: {code:java} == Parsed Logical Plan == 'Project ['r.id] +- 'Filter ('r.restaurant_id = 360) +- 'Join Inner, ('i.restaurant_id = 'r.restaurant_id) :- 'Join Inner, ('p.root_referrer_id = 'r.id) : :- 'SubqueryAlias `r` : : +- 'UnresolvedRelation `sqoop_wongnai`.`wongnai__w_ref` : +- 'SubqueryAlias `p` : +- 'UnresolvedRelation `sqoop_wongnai`.`wongnai__w_photo` +- 'SubqueryAlias `i` +- 'UnresolvedRelation `tests`.`restaurants_id_export_photos` == Analyzed Logical Plan == id: bigint Project [id#1834834L] +- Filter (restaurant_id#1834836L = cast(360 as bigint)) +- Join Inner, (restaurant_id#1834876L = restaurant_id#1834836L) :- Join Inner, (root_referrer_id#1834858L = id#1834834L) : :- SubqueryAlias `r` : : +- SubqueryAlias `sqoop_wongnai`.`wongnai__w_ref` : : +- HiveTableRelation `sqoop_wongnai`.`wongnai__w_ref`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#1834834L, restaurant_id#1834836L,] : +- SubqueryAlias `p` : +- SubqueryAlias `sqoop_wongnai`.`wongnai__w_photo` : +- HiveTableRelation `sqoop_wongnai`.`wongnai__w_photo`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#1834858L, ... 14 more fields] +- SubqueryAlias `i` +- SubqueryAlias `tests`.`restaurants_id_export_photos` +- HiveTableRelation `tests`.`restaurants_id_export_photos`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [restaurant_id#1834876L]== Optimized Logical Plan == Project [id#1834834L] +- Join Inner, (restaurant_id#1834876L = restaurant_id#1834836L) :- Project [id#1834834L, restaurant_id#1834836L] : +- Join Inner, (root_referrer_id#1834858L = id#1834834L) : :- Project [id#1834834L, restaurant_id#1834836L] : : +- Filter ((isnotnull(restaurant_id#1834836L) && (restaurant_id#1834836L = 360)) && isnotnull(id#1834834L)) : : +- InMemoryRelation [id#1834834L, restaurant_id#1834836L], StorageLevel(disk, memory, deserialized, 1 replicas) : : +- Scan hive sqoop_wongnai.wongnai__w_ref [id#87357L, restaurant_id#87359L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_ref`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#87357L, restaurant_id#87359L] : +- Project [root_referrer_id#1834858L] : +- Filter (isnotnull(root_referrer_id#1834858L) && bloomfilter#1835088 of [id#1834834L] filtering [root_referrer_id#1834858L]) : +- InMemoryRelation [root_referrer_id#1834858L], StorageLevel(disk, memory, deserialized, 1 replicas) : +- Scan hive sqoop_wongnai.wongnai__w_photo [root_referrer_id#82906L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_photo`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#82906L] +- Filter ((isnotnull(restaurant_id#1834876L) && (restaurant_id#1834876L = 360)) && bloomfilter#1835087 of [restaurant_id#1834836L] filtering [restaurant_id#1834876L]) +- HiveTableRelation `tests`.`restaurants_id_export_photos`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [restaurant_id#1834876L] == Physical Plan == *(7) Project [id#1834834L] +- *(7) SortMergeJoin [restaurant_id#1834836L], [restaurant_id#1834876L], Inner :- *(5) Sort [restaurant_id#1834836L ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(restaurant_id#1834836L, 200) : +- *(4) Project [id#1834834L, restaurant_id#1834836L] : +- *(4) SortMergeJoin [id#1834834L], [root_referrer_id#1834858L], Inner : :- *(2) Sort [id#1834834L ASC NULLS FIRST], false, 0 : : +- Exchange hashpartitioning(id#1834834L, 200) : : +- *(1) Filter ((isnotnull(restaurant_id#1834836L) && (restaurant_id#1834836L = 360)) && isnotnull(id#1834834L)) : : +- *(1) InMemoryTableScan [id#1834834L, restaurant_id#1834836L], [isnotnull(restaurant_id#1834836L), (restaurant_id#1834836L = 360), isnotnull(id#1834834L)] : : +- InMemoryRelation [id#1834834L, restaurant_id#1834836L], StorageLevel(disk, memory, deserialized, 1 replicas) : : +- Scan hive sqoop_wongnai.wongnai__w_ref [id#87357L, restaurant_id#87359L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_ref`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#87357L, restaurant_id#87359L] : +- *(3) Sort [root_referrer_id#1834858L ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(root_referrer_id#1834858L, 200) : +- Filter (isnotnull(root_referrer_id#1834858L) && bloomfilter#1835088 of [bf1835088 id#1834834L estimatedNumRows=1457534] filtering [root_referrer_id#1834858L]) : : +- GenerateBloomFilter : : +- ReusedExchange [id#1834834L, restaurant_id#1834836L], Exchange hashpartitioning(id#1834834L, 200) : +- InMemoryTableScan [root_referrer_id#1834858L], [isnotnull(root_referrer_id#1834858L), bloomfilter#1835088 of [bf1835088 id#1834834L estimatedNumRows=1457534] filtering [root_referrer_id#1834858L]] : :- InMemoryRelation [root_referrer_id#1834858L], StorageLevel(disk, memory, deserialized, 1 replicas) : : +- Scan hive sqoop_wongnai.wongnai__w_photo [root_referrer_id#82906L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_photo`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#82906L] : +- GenerateBloomFilter : +- ReusedExchange [id#1834834L, restaurant_id#1834836L], Exchange hashpartitioning(id#1834834L, 200) +- *(6) Sort [restaurant_id#1834876L ASC NULLS FIRST], false, 0 +- Exchange hashpartitioning(restaurant_id#1834876L, 200) +- Filter ((isnotnull(restaurant_id#1834876L) && (restaurant_id#1834876L = 360)) && bloomfilter#1835087 of [bf1835088 id#1834834L estimatedNumRows=1457534] filtering [restaurant_id#1834876L]) : +- GenerateBloomFilter : +- ReusedExchange [id#1834834L, restaurant_id#1834836L], Exchange hashpartitioning(id#1834834L, 200) +- Scan hive tests.restaurants_id_export_photos [restaurant_id#1834876L], HiveTableRelation `tests`.`restaurants_id_export_photos`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [restaurant_id#1834876L] {code} I noticed that in physical plan {code:java} bloomfilter#1835087 of [bf1835088 id#1834834L estimatedNumRows=1457534] filtering [restaurant_id#1834876L] {code} it use bf1835088 id#1834834L for filtering restaurant_id#1834876L and which is different from join condition. I think that the correct Physical query plan should use bf1835087 restaurant_id#1834836L for filtering restaurant_id#1834876L. You can see full of query plan of both incorrect and correct in this link: [https://gist.github.com/tanapoln/67a335858dede5eddd9d80659de60d72] Please help. was: Given this query {code:java} select r.id from sqoop_wongnai.wongnai__w_ref r inner join sqoop_wongnai.wongnai__w_photo p on p.root_referrer_id = r.id inner join tests.restaurants_id_export_photos i on i.restaurant_id = r.restaurant_id where r.restaurant_id = 360 {code} Spark return nothing except when I switch inner join statement between line 3 and 4 there are results returned. I guarantee that we have data that match the join condition above. Here's result from explain extend of above query: {code:java} == Parsed Logical Plan == 'Project ['r.id] +- 'Filter ('r.restaurant_id = 360) +- 'Join Inner, ('i.restaurant_id = 'r.restaurant_id) :- 'Join Inner, ('p.root_referrer_id = 'r.id) : :- 'SubqueryAlias `r` : : +- 'UnresolvedRelation `sqoop_wongnai`.`wongnai__w_ref` : +- 'SubqueryAlias `p` : +- 'UnresolvedRelation `sqoop_wongnai`.`wongnai__w_photo` +- 'SubqueryAlias `i` +- 'UnresolvedRelation `tests`.`restaurants_id_export_photos`== Analyzed Logical Plan == id: bigint Project [id#1834834L] +- Filter (restaurant_id#1834836L = cast(360 as bigint)) +- Join Inner, (restaurant_id#1834876L = restaurant_id#1834836L) :- Join Inner, (root_referrer_id#1834858L = id#1834834L) : :- SubqueryAlias `r` : : +- SubqueryAlias `sqoop_wongnai`.`wongnai__w_ref` : : +- HiveTableRelation `sqoop_wongnai`.`wongnai__w_ref`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#1834834L, restaurant_id#1834836L,] : +- SubqueryAlias `p` : +- SubqueryAlias `sqoop_wongnai`.`wongnai__w_photo` : +- HiveTableRelation `sqoop_wongnai`.`wongnai__w_photo`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#1834858L, ... 14 more fields] +- SubqueryAlias `i` +- SubqueryAlias `tests`.`restaurants_id_export_photos` +- HiveTableRelation `tests`.`restaurants_id_export_photos`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [restaurant_id#1834876L]== Optimized Logical Plan == Project [id#1834834L] +- Join Inner, (restaurant_id#1834876L = restaurant_id#1834836L) :- Project [id#1834834L, restaurant_id#1834836L] : +- Join Inner, (root_referrer_id#1834858L = id#1834834L) : :- Project [id#1834834L, restaurant_id#1834836L] : : +- Filter ((isnotnull(restaurant_id#1834836L) && (restaurant_id#1834836L = 360)) && isnotnull(id#1834834L)) : : +- InMemoryRelation [id#1834834L, restaurant_id#1834836L], StorageLevel(disk, memory, deserialized, 1 replicas) : : +- Scan hive sqoop_wongnai.wongnai__w_ref [id#87357L, restaurant_id#87359L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_ref`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#87357L, restaurant_id#87359L] : +- Project [root_referrer_id#1834858L] : +- Filter (isnotnull(root_referrer_id#1834858L) && bloomfilter#1835088 of [id#1834834L] filtering [root_referrer_id#1834858L]) : +- InMemoryRelation [root_referrer_id#1834858L], StorageLevel(disk, memory, deserialized, 1 replicas) : +- Scan hive sqoop_wongnai.wongnai__w_photo [root_referrer_id#82906L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_photo`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#82906L] +- Filter ((isnotnull(restaurant_id#1834876L) && (restaurant_id#1834876L = 360)) && bloomfilter#1835087 of [restaurant_id#1834836L] filtering [restaurant_id#1834876L]) +- HiveTableRelation `tests`.`restaurants_id_export_photos`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [restaurant_id#1834876L]== Physical Plan == *(7) Project [id#1834834L] +- *(7) SortMergeJoin [restaurant_id#1834836L], [restaurant_id#1834876L], Inner :- *(5) Sort [restaurant_id#1834836L ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(restaurant_id#1834836L, 200) : +- *(4) Project [id#1834834L, restaurant_id#1834836L] : +- *(4) SortMergeJoin [id#1834834L], [root_referrer_id#1834858L], Inner : :- *(2) Sort [id#1834834L ASC NULLS FIRST], false, 0 : : +- Exchange hashpartitioning(id#1834834L, 200) : : +- *(1) Filter ((isnotnull(restaurant_id#1834836L) && (restaurant_id#1834836L = 360)) && isnotnull(id#1834834L)) : : +- *(1) InMemoryTableScan [id#1834834L, restaurant_id#1834836L], [isnotnull(restaurant_id#1834836L), (restaurant_id#1834836L = 360), isnotnull(id#1834834L)] : : +- InMemoryRelation [id#1834834L, restaurant_id#1834836L], StorageLevel(disk, memory, deserialized, 1 replicas) : : +- Scan hive sqoop_wongnai.wongnai__w_ref [id#87357L, restaurant_id#87359L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_ref`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#87357L, restaurant_id#87359L] : +- *(3) Sort [root_referrer_id#1834858L ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(root_referrer_id#1834858L, 200) : +- Filter (isnotnull(root_referrer_id#1834858L) && bloomfilter#1835088 of [bf1835088 id#1834834L estimatedNumRows=1457534] filtering [root_referrer_id#1834858L]) : : +- GenerateBloomFilter : : +- ReusedExchange [id#1834834L, restaurant_id#1834836L], Exchange hashpartitioning(id#1834834L, 200) : +- InMemoryTableScan [root_referrer_id#1834858L], [isnotnull(root_referrer_id#1834858L), bloomfilter#1835088 of [bf1835088 id#1834834L estimatedNumRows=1457534] filtering [root_referrer_id#1834858L]] : :- InMemoryRelation [root_referrer_id#1834858L], StorageLevel(disk, memory, deserialized, 1 replicas) : : +- Scan hive sqoop_wongnai.wongnai__w_photo [root_referrer_id#82906L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_photo`, org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#82906L] : +- GenerateBloomFilter : +- ReusedExchange [id#1834834L, restaurant_id#1834836L], Exchange hashpartitioning(id#1834834L, 200) +- *(6) Sort [restaurant_id#1834876L ASC NULLS FIRST], false, 0 +- Exchange hashpartitioning(restaurant_id#1834876L, 200) +- Filter ((isnotnull(restaurant_id#1834876L) && (restaurant_id#1834876L = 360)) && bloomfilter#1835087 of [bf1835088 id#1834834L estimatedNumRows=1457534] filtering [restaurant_id#1834876L]) : +- GenerateBloomFilter : +- ReusedExchange [id#1834834L, restaurant_id#1834836L], Exchange hashpartitioning(id#1834834L, 200) +- Scan hive tests.restaurants_id_export_photos [restaurant_id#1834876L], HiveTableRelation `tests`.`restaurants_id_export_photos`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [restaurant_id#1834876L] {code} I noticed that in physical plan {code:java} bloomfilter#1835087 of [bf1835088 id#1834834L estimatedNumRows=1457534] filtering [restaurant_id#1834876L] {code} it use bf1835088 id#1834834L for filtering restaurant_id#1834876L and which is different from join condition. I think that the correct Physical query plan should use bf1835087 restaurant_id#1834836L for filtering restaurant_id#1834876L. You can see full of query plan of both incorrect and correct in this link: [https://gist.github.com/tanapoln/67a335858dede5eddd9d80659de60d72] Please help. > Physical Plan and Optimized Logical Plan are mismatch > ----------------------------------------------------- > > Key: SPARK-31148 > URL: https://issues.apache.org/jira/browse/SPARK-31148 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 2.4.4 > Reporter: Tanapol Nearunchorn > Priority: Major > > Given this query > > {code:java} > select r.id > from sqoop_wongnai.wongnai__w_ref r > inner join sqoop_wongnai.wongnai__w_photo p on p.root_referrer_id = r.id > inner join tests.restaurants_id_export_photos i on i.restaurant_id = > r.restaurant_id > where r.restaurant_id = 360 > {code} > Spark return nothing except when I switch inner join statement between line 3 > and 4 there are results returned. > I guarantee that we have data that match the join condition above. > Here's result from explain extend of above query: > > {code:java} > == Parsed Logical Plan == > 'Project ['r.id] > +- 'Filter ('r.restaurant_id = 360) > +- 'Join Inner, ('i.restaurant_id = 'r.restaurant_id) > :- 'Join Inner, ('p.root_referrer_id = 'r.id) > : :- 'SubqueryAlias `r` > : : +- 'UnresolvedRelation `sqoop_wongnai`.`wongnai__w_ref` > : +- 'SubqueryAlias `p` > : +- 'UnresolvedRelation `sqoop_wongnai`.`wongnai__w_photo` > +- 'SubqueryAlias `i` > +- 'UnresolvedRelation `tests`.`restaurants_id_export_photos` > == Analyzed Logical Plan == > id: bigint > Project [id#1834834L] > +- Filter (restaurant_id#1834836L = cast(360 as bigint)) > +- Join Inner, (restaurant_id#1834876L = restaurant_id#1834836L) > :- Join Inner, (root_referrer_id#1834858L = id#1834834L) > : :- SubqueryAlias `r` > : : +- SubqueryAlias `sqoop_wongnai`.`wongnai__w_ref` > : : +- HiveTableRelation `sqoop_wongnai`.`wongnai__w_ref`, > org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#1834834L, > restaurant_id#1834836L,] > : +- SubqueryAlias `p` > : +- SubqueryAlias `sqoop_wongnai`.`wongnai__w_photo` > : +- HiveTableRelation `sqoop_wongnai`.`wongnai__w_photo`, > org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#1834858L, ... > 14 more fields] > +- SubqueryAlias `i` > +- SubqueryAlias `tests`.`restaurants_id_export_photos` > +- HiveTableRelation `tests`.`restaurants_id_export_photos`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > [restaurant_id#1834876L]== Optimized Logical Plan == > Project [id#1834834L] > +- Join Inner, (restaurant_id#1834876L = restaurant_id#1834836L) > :- Project [id#1834834L, restaurant_id#1834836L] > : +- Join Inner, (root_referrer_id#1834858L = id#1834834L) > : :- Project [id#1834834L, restaurant_id#1834836L] > : : +- Filter ((isnotnull(restaurant_id#1834836L) && > (restaurant_id#1834836L = 360)) && isnotnull(id#1834834L)) > : : +- InMemoryRelation [id#1834834L, restaurant_id#1834836L], > StorageLevel(disk, memory, deserialized, 1 replicas) > : : +- Scan hive sqoop_wongnai.wongnai__w_ref [id#87357L, > restaurant_id#87359L], HiveTableRelation `sqoop_wongnai`.`wongnai__w_ref`, > org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#87357L, > restaurant_id#87359L] > : +- Project [root_referrer_id#1834858L] > : +- Filter (isnotnull(root_referrer_id#1834858L) && > bloomfilter#1835088 of [id#1834834L] filtering [root_referrer_id#1834858L]) > : +- InMemoryRelation [root_referrer_id#1834858L], > StorageLevel(disk, memory, deserialized, 1 replicas) > : +- Scan hive sqoop_wongnai.wongnai__w_photo > [root_referrer_id#82906L], HiveTableRelation > `sqoop_wongnai`.`wongnai__w_photo`, > org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#82906L] > +- Filter ((isnotnull(restaurant_id#1834876L) && (restaurant_id#1834876L = > 360)) && bloomfilter#1835087 of [restaurant_id#1834836L] filtering > [restaurant_id#1834876L]) > +- HiveTableRelation `tests`.`restaurants_id_export_photos`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [restaurant_id#1834876L] > == Physical Plan == > *(7) Project [id#1834834L] > +- *(7) SortMergeJoin [restaurant_id#1834836L], [restaurant_id#1834876L], > Inner > :- *(5) Sort [restaurant_id#1834836L ASC NULLS FIRST], false, 0 > : +- Exchange hashpartitioning(restaurant_id#1834836L, 200) > : +- *(4) Project [id#1834834L, restaurant_id#1834836L] > : +- *(4) SortMergeJoin [id#1834834L], [root_referrer_id#1834858L], > Inner > : :- *(2) Sort [id#1834834L ASC NULLS FIRST], false, 0 > : : +- Exchange hashpartitioning(id#1834834L, 200) > : : +- *(1) Filter ((isnotnull(restaurant_id#1834836L) && > (restaurant_id#1834836L = 360)) && isnotnull(id#1834834L)) > : : +- *(1) InMemoryTableScan [id#1834834L, > restaurant_id#1834836L], [isnotnull(restaurant_id#1834836L), > (restaurant_id#1834836L = 360), isnotnull(id#1834834L)] > : : +- InMemoryRelation [id#1834834L, > restaurant_id#1834836L], StorageLevel(disk, memory, deserialized, 1 replicas) > : : +- Scan hive sqoop_wongnai.wongnai__w_ref > [id#87357L, restaurant_id#87359L], HiveTableRelation > `sqoop_wongnai`.`wongnai__w_ref`, > org.apache.hadoop.hive.serde2.avro.AvroSerDe, [id#87357L, > restaurant_id#87359L] > : +- *(3) Sort [root_referrer_id#1834858L ASC NULLS FIRST], > false, 0 > : +- Exchange hashpartitioning(root_referrer_id#1834858L, 200) > : +- Filter (isnotnull(root_referrer_id#1834858L) && > bloomfilter#1835088 of [bf1835088 id#1834834L estimatedNumRows=1457534] > filtering [root_referrer_id#1834858L]) > : : +- GenerateBloomFilter > : : +- ReusedExchange [id#1834834L, > restaurant_id#1834836L], Exchange hashpartitioning(id#1834834L, 200) > : +- InMemoryTableScan [root_referrer_id#1834858L], > [isnotnull(root_referrer_id#1834858L), bloomfilter#1835088 of [bf1835088 > id#1834834L estimatedNumRows=1457534] filtering [root_referrer_id#1834858L]] > : :- InMemoryRelation > [root_referrer_id#1834858L], StorageLevel(disk, memory, deserialized, 1 > replicas) > : : +- Scan hive > sqoop_wongnai.wongnai__w_photo [root_referrer_id#82906L], HiveTableRelation > `sqoop_wongnai`.`wongnai__w_photo`, > org.apache.hadoop.hive.serde2.avro.AvroSerDe, [root_referrer_id#82906L] > : +- GenerateBloomFilter > : +- ReusedExchange [id#1834834L, > restaurant_id#1834836L], Exchange hashpartitioning(id#1834834L, 200) > +- *(6) Sort [restaurant_id#1834876L ASC NULLS FIRST], false, 0 > +- Exchange hashpartitioning(restaurant_id#1834876L, 200) > +- Filter ((isnotnull(restaurant_id#1834876L) && > (restaurant_id#1834876L = 360)) && bloomfilter#1835087 of [bf1835088 > id#1834834L estimatedNumRows=1457534] filtering [restaurant_id#1834876L]) > : +- GenerateBloomFilter > : +- ReusedExchange [id#1834834L, restaurant_id#1834836L], > Exchange hashpartitioning(id#1834834L, 200) > +- Scan hive tests.restaurants_id_export_photos > [restaurant_id#1834876L], HiveTableRelation > `tests`.`restaurants_id_export_photos`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [restaurant_id#1834876L] > {code} > > > > I noticed that in physical plan > > {code:java} > bloomfilter#1835087 of [bf1835088 id#1834834L estimatedNumRows=1457534] > filtering [restaurant_id#1834876L] > {code} > it use bf1835088 id#1834834L for filtering restaurant_id#1834876L and which > is different from join condition. > I think that the correct Physical query plan should use bf1835087 > restaurant_id#1834836L for filtering restaurant_id#1834876L. > > You can see full of query plan of both incorrect and correct in this link: > [https://gist.github.com/tanapoln/67a335858dede5eddd9d80659de60d72] > > Please help. > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org