[ https://issues.apache.org/jira/browse/KYLIN-998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shaofeng SHI reassigned KYLIN-998:
----------------------------------

    Assignee: nichunen  (was: Shaofeng SHI)

> Finish the hive intermediate table clean up job in org.apache.kylin.job.hadoop.cube.StorageCleanupJob
> -----------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-998
>                 URL: https://issues.apache.org/jira/browse/KYLIN-998
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Storage - HBase
>    Affects Versions: v0.7.1, v0.7.2
>            Reporter: nichunen
>            Assignee: nichunen
>            Priority: Major
>             Fix For: v1.1
>
>         Attachments: KYLIN-998-0.7-staging-v3.patch, KYLIN-998-0.7-staging.patch, KYLIN-998-0.8-v3.patch, KYLIN-998-0.8.patch, KYLIN-998-UUIDS.patch
>
>
> Currently, the last step of a kylin cube building job, named “Garbage Collection”, removes the intermediate data in hdfs/hbase/hive. But if the job is stopped unexpectedly (for example by a problem in the hadoop cluster, a bad cube design, or being discarded by the user), the data is left undeleted.
> In such cases, we can run "hbase org.apache.hadoop.util.RunJar $KYLIN_HOME/lib/kylin-job-0.8.1-incubating-SNAPSHOT.jar org.apache.kylin.job.hadoop.cube.StorageCleanupJob --delete true" to remove the data. But the method "cleanUnusedIntermediateHiveTable" is unfinished.
> My first patch finishes the method; it removes unused hive tables whose names begin with "kylin_intermediate_".
> My second patch adds some methods to enable deleting unused data by uuids given on the command line or stored in a file.
> I don't know whether the second patch is useful to you; it's used on our kylin server to remove data after a cube is deleted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
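
For readers following the description of the first patch, the selection of tables to drop might look roughly like the sketch below. It is only an illustration and not the code in the attached patches: the class name IntermediateTableFilter, the method findUnused, and the idea of passing in a set of table names that are still in use are all assumptions. Only the prefix "kylin_intermediate_" is taken from the issue text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Kylin.
class IntermediateTableFilter {
    // Prefix named in the issue description.
    static final String PREFIX = "kylin_intermediate_";

    /**
     * Returns the tables whose names start with PREFIX and that are not
     * in the in-use set. How the real job lists hive tables, or decides
     * which ones are still in use, is not stated in the issue; both
     * inputs here are assumptions.
     */
    static List<String> findUnused(List<String> allTables, Set<String> inUse) {
        List<String> unused = new ArrayList<>();
        for (String table : allTables) {
            if (table.startsWith(PREFIX) && !inUse.contains(table)) {
                unused.add(table);
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        // Made-up table names for illustration only.
        List<String> all = Arrays.asList(
                "kylin_intermediate_a", "kylin_intermediate_b", "sales_fact");
        Set<String> inUse = new HashSet<>(Arrays.asList("kylin_intermediate_a"));

        // Print the candidates instead of dropping anything.
        for (String table : findUnused(all, inUse)) {
            System.out.println("would drop: " + table);
        }
    }
}
```

Tables without the prefix are never selected, so ordinary hive tables are left alone even if they are missing from the in-use set.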