GitHub user jianqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/332#discussion_r170369358
  
    --- Diff: query_optimizer/cost_model/StarSchemaSimpleCostModel.cpp ---
    @@ -493,7 +493,7 @@ std::size_t StarSchemaSimpleCostModel::getNumDistinctValues(
           return stat.getNumDistinctValues(rel_attr_id);
         }
       }
    -  return estimateCardinalityForTableReference(table_reference);
    +  return estimateCardinalityForTableReference(table_reference) * 0.1;
    --- End diff --
    
    This estimation ratio can be any decimal number that is not close to `1`. With a ratio close to `1`, the column would appear to have "unique" values, and the optimizer would choose bad plans in some situations.
    
    `0.1` tends to be a reasonable choice. Other values such as `0.05` or `0.2` would also work, and the ratio can be adjusted later when there is an actual need.
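
As a rough illustration of the fallback discussed here, the following is a minimal standalone sketch, not the actual Quickstep code. The helper name `EstimateDistinctValuesWithoutStats`, the constant `kDefaultDistinctRatio`, and the clamping to at least one value are assumptions made for the example. The diff itself multiplies the cardinality by `0.1` and relies on the implicit conversion back to `std::size_t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical default; the comment above suggests 0.1, with 0.05 or 0.2
// as possible alternatives.
constexpr double kDefaultDistinctRatio = 0.1;

// Hypothetical helper (not part of Quickstep): estimates the number of
// distinct values of a column when no statistics are available, as a fixed
// fraction of the estimated table cardinality.
std::size_t EstimateDistinctValuesWithoutStats(
    std::size_t cardinality, double ratio = kDefaultDistinctRatio) {
  // The ratio is expected to be a fraction in (0, 1].
  assert(ratio > 0.0 && ratio <= 1.0);
  if (cardinality == 0) {
    return 0;
  }
  // Make the double -> std::size_t truncation explicit.
  const auto scaled =
      static_cast<std::size_t>(static_cast<double>(cardinality) * ratio);
  // Assumption: a non-empty table has at least one distinct value.
  return std::max<std::size_t>(scaled, 1);
}
```

Keeping the ratio in one named constant would make a later adjustment a one-line change.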


