[ https://issues.apache.org/jira/browse/KYLIN-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaoxiang Yu closed KYLIN-4833.
-------------------------------

Released in Kylin 3.1.2.

> use distcp to control the speed of writing hfile data to hbase cluster
> ----------------------------------------------------------------------
>
>                 Key: KYLIN-4833
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4833
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Storage - HBase
>    Affects Versions: v3.1.1
>            Reporter: fengpod
>            Assignee: fengpod
>            Priority: Minor
>             Fix For: v3.1.2
>
>
> When a large amount of data is written to the HBase cluster at once, the
> cluster load becomes very high, which degrades query performance. This PR
> lets the "Convert Cuboid Data to HFile" step write its output to Hadoop HDFS;
> the HFiles are then transferred to the HBase cluster by DistCp. DistCp limits
> the write speed, which reduces the pressure on the cluster.
>
> This PR adds a new step, "HFile Distcp To HBase", between "Convert Cuboid
> Data to HFile" and "Load HFile to HBase Table". It looks like this:
> !https://user-images.githubusercontent.com/4843586/100835711-013fae00-34a9-11eb-8de8-e69228ba0991.png!
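
To illustrate how DistCp can cap the write speed, here is a minimal sketch that builds a throttled `hadoop distcp` command line. It is not the code from this PR. The helper names and the HDFS paths are hypothetical, and how Kylin actually configures the step is not shown in this message. The sketch relies only on the standard DistCp options `-bandwidth` (MB per second per map) and `-m` (maximum number of simultaneous maps).

```python
from typing import List


def build_distcp_command(src: str, dst: str,
                         bandwidth_mb_per_map: int,
                         max_maps: int) -> List[str]:
    """Build a `hadoop distcp` argument list with a throughput cap.

    Hypothetical helper for illustration only; not part of Kylin.
    """
    if bandwidth_mb_per_map <= 0 or max_maps <= 0:
        raise ValueError("bandwidth and map count must be positive")
    return [
        "hadoop", "distcp",
        "-bandwidth", str(bandwidth_mb_per_map),  # MB/s allowed per map task
        "-m", str(max_maps),                      # max simultaneous copy maps
        src, dst,
    ]


def max_throughput_mb_per_sec(bandwidth_mb_per_map: int, max_maps: int) -> int:
    """Rough upper bound on the aggregate copy rate in MB/s."""
    return bandwidth_mb_per_map * max_maps


if __name__ == "__main__":
    # Hypothetical source (compute cluster) and destination (HBase cluster).
    cmd = build_distcp_command(
        "hdfs://compute-cluster/kylin/hfile",
        "hdfs://hbase-cluster/kylin/hfile",
        bandwidth_mb_per_map=10,
        max_maps=20,
    )
    print(" ".join(cmd))
```

The aggregate rate is bounded by roughly the per-map bandwidth times the number of maps, so lowering either value reduces the load placed on the HBase cluster while the HFiles are copied.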