kangkaisen created KYLIN-1694:
---------------------------------

             Summary: make multiply coefficient configurable when estimating cuboid size
                 Key: KYLIN-1694
                 URL: https://issues.apache.org/jira/browse/KYLIN-1694
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine
    Affects Versions: v1.5.1, v1.5.0
            Reporter: kangkaisen
            Assignee: Dong Li

In the current version of the MRv2 build engine, CubeStatsReader estimates cuboid size with hard-coded coefficients: "cube is memory hungry, storage size estimation multiply 0.05" and "cube is not memory hungry, storage size estimation multiply 0.25".

This has one major problem: the default coefficients are too small, so the estimated cuboid size is much smaller than the actual cuboid size. As a result, both the number of HBase regions and the number of reducers for CubeHFileJob are too low, which makes CubeHFileJob much slower. After we removed the default multiply coefficient, CubeHFileJob became much faster.

We'd better make the multiply coefficient configurable, which would be more user-friendly.
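
A minimal sketch of what a configurable coefficient could look like, assuming the value is read from a properties-style configuration and falls back to the current defaults (0.25, or 0.05 for memory-hungry cubes). The class name, method names, and property keys below are hypothetical and chosen only for illustration; they are not taken from the Kylin code base.

```java
import java.util.Properties;

// Illustration only: all names here are hypothetical, not Kylin's API.
final class CuboidSizeEstimateSketch {
    // Defaults mirror the hard-coded values quoted in this issue.
    static final double DEFAULT_RATIO = 0.25;
    static final double DEFAULT_MEM_HUNGRY_RATIO = 0.05;

    // Hypothetical property keys for the two coefficients.
    static final String RATIO_KEY = "example.cuboid.size.ratio";
    static final String MEM_HUNGRY_RATIO_KEY = "example.cuboid.size.memhungry.ratio";

    // Returns the configured coefficient, or the default when the key is unset.
    static double coefficient(Properties conf, boolean memoryHungry) {
        String key = memoryHungry ? MEM_HUNGRY_RATIO_KEY : RATIO_KEY;
        double fallback = memoryHungry ? DEFAULT_MEM_HUNGRY_RATIO : DEFAULT_RATIO;
        String raw = conf.getProperty(key);
        return raw == null ? fallback : Double.parseDouble(raw.trim());
    }

    // Scales a raw size estimate (in MB) by the selected coefficient.
    static double estimateSizeMB(double rawSizeMB, Properties conf, boolean memoryHungry) {
        return rawSizeMB * coefficient(conf, memoryHungry);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // No override: the default 0.25 applies.
        System.out.println(estimateSizeMB(1000.0, conf, false));
        // Override to 1.0, which is equivalent to removing the coefficient.
        conf.setProperty(RATIO_KEY, "1.0");
        System.out.println(estimateSizeMB(1000.0, conf, false));
    }
}
```

Setting the coefficient to 1.0 in this sketch corresponds to the experiment described above, where the default coefficient was removed; a larger estimate would in turn lead to more HBase regions and more CubeHFileJob reducers.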