> Caused by:
> org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionError:
> VectorMapJoin Hash table loading exceeded memory limits.
> estimatedMemoryUsage: 1644167752 noconditionalTaskSize: 463667612
> inflationFactor: 2.0 threshold: 927335232 effectiveThreshold: 927335232

Most likely the table does not have column statistics, so Hive cannot estimate join sizes correctly or run the query through its CBO. Check whether the explain plan says "Optimized by CBO".

Also check whether it shows in_bloom_filter() on the store_sales scanner: if "COLUMN STATS: COMPLETE" is missing, the bloom filters get disabled because they can't be sized from the row counts.

> query25 against a 25GB dataset (my instance memory size is 64GB)

This is an artificial error, set up so that no single query can overload a daemon. With a single-node, single-query setup, you can probably just disable the check:

set hive.llap.mapjoin.memory.monitor.check.interval=0;

Cheers,
Gopal
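
For reference, a minimal sketch of the statistics check described above. It assumes the standard TPC-DS table and column names and that the tables live in the current database. The small join is only a stand-in for query25, and ss_item_sk is just an example column. On older Hive versions a partitioned table may also need an explicit PARTITION clause in the ANALYZE statement.

```sql
-- Assumption: standard TPC-DS schema in the current database.
-- Gather column statistics; repeat for the other tables query25 joins.
ANALYZE TABLE store_sales COMPUTE STATISTICS FOR COLUMNS;

-- Inspect the statistics recorded for one column (example column only).
DESCRIBE FORMATTED store_sales ss_item_sk;

-- Stand-in for query25: in the output, look for "Optimized by CBO",
-- "COLUMN STATS: COMPLETE" and in_bloom_filter() on the store_sales scan.
EXPLAIN
SELECT count(*)
FROM store_sales ss
JOIN item i ON ss.ss_item_sk = i.i_item_sk;
```

If the plan still lacks those markers after the statistics are gathered, the missing statistics were probably not the only cause.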
