This sounds like a YARN configuration problem.
Parallelism is good :), and not all of the map and reduce tasks run at the same time.
Check settings such as:


   yarn.nodemanager.resource.memory-mb per node

   yarn.nodemanager.resource.cpu-vcores per node
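
As a rough sketch of why these two settings matter: the number of containers a node can run at once is bounded by both the memory and the vcores it offers to YARN. The node values, the 2048 MB container size, and the 3-node cluster below are assumptions for illustration, not values from your cluster; only the figure of 500 maps comes from your message. Depending on the scheduler's resource calculator, only the memory bound may actually be enforced.

```python
import math


def max_concurrent_containers(node_memory_mb, node_vcores,
                              container_memory_mb, container_vcores=1):
    """Upper bound on containers one node can run at the same time."""
    by_memory = node_memory_mb // container_memory_mb
    by_vcores = node_vcores // container_vcores
    return min(by_memory, by_vcores)


def waves(total_tasks, nodes, containers_per_node):
    """Number of rounds needed to run all tasks on the whole cluster."""
    return math.ceil(total_tasks / (nodes * containers_per_node))


# Hypothetical node: yarn.nodemanager.resource.memory-mb = 16384,
# yarn.nodemanager.resource.cpu-vcores = 8, 2048 MB per container.
per_node = max_concurrent_containers(16384, 8, 2048)
print(per_node)                 # containers per node
print(waves(500, 3, per_node))  # rounds for 500 maps on 3 such nodes
```

Lowering either setting reduces how many tasks run in parallel, so the job takes more rounds but leaves resources free for other work.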

This can help you to start:

If your cluster is very small, a 256 MB block size may be too big; you
could try 128 MB.
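
For context on the trade-off, here is a minimal sketch of how block size relates to the number of map tasks, assuming one map per block-sized split. The 64 GB input size is hypothetical, and real split counts are computed per file and depend on the input format and the minimum/maximum split size settings.

```python
import math

MB = 1024 * 1024


def estimated_map_tasks(input_bytes, split_bytes):
    """Approximate map count, assuming one map per block-sized split."""
    return math.ceil(input_bytes / split_bytes)


# Hypothetical 64 GB of input, compared at the two block sizes discussed.
input_bytes = 64 * 1024 * MB
for block_mb in (128, 256):
    print(block_mb, estimated_map_tasks(input_bytes, block_mb * MB))
```

A larger block size gives fewer maps, but each one processes more data.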

On 27 May 2017 at 08:49, jianhui.yi <> wrote:

> My model has 7 tables and the cube has 15 dimensions. The “Convert Cuboid
> Data to HFile” step starts too many maps and reduces (500+ maps, 1.4k+
> reduces) and uses up all the resources of the small cluster.
> I set these parameters in the cluster:
> dfs.block.size=256M
> hive.exec.reducers.bytes.per.reducer=1073741824
> hive.merge.mapfiles=true
> hive.merge.mapredfiles=true
> hive.merge.size.per.task=256M
> The kylin_hive_conf.xml file uses the default settings.
> Where can I tune performance?
> Thanks.

