[ https://issues.apache.org/jira/browse/CALCITE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Caizhi Weng updated CALCITE-4351:
---------------------------------
    Summary: The result of RelMdUtil#numDistinctVals will always be 0 when inputs are large  (was: The result of RelMdUtil#numDistinctVals is incorrect when inputs are large)

> The result of RelMdUtil#numDistinctVals will always be 0 when inputs are large
> ------------------------------------------------------------------------------
>
>                 Key: CALCITE-4351
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4351
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.26.0
>            Reporter: Caizhi Weng
>            Priority: Major
>
> The previous implementation of {{RelMdUtil#numDistinctVals}} used the
> approximation {{ln(1 + x) ~= x}}, which holds when {{x}} is small.
> However, CALCITE-4132 removed this approximation to make the result more
> accurate. As a result, the function now returns an incorrect value for large
> inputs due to loss of floating-point precision. For example, when
> {{domainSize = 1e18}} and {{numSelected = 1e10}}, the result is 0.
> I suggest treating small and large inputs differently: for small inputs, use
> the new, more precise formula; for large inputs, use the old, approximated
> formula.
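
The precision problem described above can be reproduced with a minimal sketch. It assumes the precise formula has the form domainSize * (1 - exp(numSelected * ln(1 - 1/domainSize))) and that the older approximation replaces ln(1 - 1/domainSize) with -1/domainSize. The class name, method names, and the threshold constant are hypothetical and chosen only for illustration; this is not Calcite's actual code, nor the fix that was eventually committed.

```java
// Hypothetical sketch; not the actual RelMdUtil implementation.
class NumDistinctValsSketch {

  // Illustrative cut-over point (an assumption, not a value from Calcite).
  static final double LARGE_DOMAIN_THRESHOLD = 1e8;

  // Assumed form of the more precise formula.
  // For a very large domainSize, 1.0 - 1.0 / domainSize rounds to exactly 1.0
  // in double arithmetic, so the logarithm is 0 and the result collapses to 0.
  static double precise(double domainSize, double numSelected) {
    double expo = numSelected * Math.log(1.0 - 1.0 / domainSize);
    return (1.0 - Math.exp(expo)) * domainSize;
  }

  // Assumed form of the older formula, using ln(1 + x) ~= x
  // with x = -1 / domainSize.
  static double approximate(double domainSize, double numSelected) {
    return (1.0 - Math.exp(-numSelected / domainSize)) * domainSize;
  }

  // Piecewise variant along the lines suggested in the issue:
  // precise formula for small domains, approximation for large ones.
  static double hybrid(double domainSize, double numSelected) {
    return domainSize < LARGE_DOMAIN_THRESHOLD
        ? precise(domainSize, numSelected)
        : approximate(domainSize, numSelected);
  }

  public static void main(String[] args) {
    double domainSize = 1e18;
    double numSelected = 1e10;
    System.out.println("precise:     " + precise(domainSize, numSelected));
    System.out.println("approximate: " + approximate(domainSize, numSelected));
    System.out.println("hybrid:      " + hybrid(domainSize, numSelected));
  }
}
```

With domainSize = 1e18 and numSelected = 1e10, the precise form evaluates to 0, while the approximated form stays close to numSelected, which is the expected order of magnitude when the domain is much larger than the number of selected values. Another option, not mentioned in the issue, would be to keep a single formula and evaluate it with Math.log1p and Math.expm1, which are designed for arguments near zero.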