[ 
https://issues.apache.org/jira/browse/KYLIN-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733295#comment-14733295
 ] 

Shaofeng SHI commented on KYLIN-998:
------------------------------------

Hi Chunen, the patch does not seem to check "delete == true", which means that 
even if the user specifies "--delete false", the tables will be dropped from 
Hive. To be consistent with the other methods, could you please add that check 
and generate a new patch? Apart from this, I didn't see any other issues; I 
will merge once the new patch is uploaded. Thanks for your time! 
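
The check requested above can be sketched roughly as follows. This is only an 
illustration, not code from the patch or from StorageCleanupJob: the class 
HiveCleanupSketch and the methods findUnused and planCleanup are hypothetical 
names, and the sketch only builds the DROP statements as strings instead of 
talking to Hive. The one point it is meant to show is that nothing is dropped 
unless delete is true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the actual StorageCleanupJob implementation.
class HiveCleanupSketch {
    // Table name prefix mentioned in the issue description.
    static final String PREFIX = "kylin_intermediate_";

    // Returns the prefixed tables that are not referenced by any job still in use.
    static List<String> findUnused(List<String> allTables, Set<String> inUse) {
        List<String> unused = new ArrayList<>();
        for (String table : allTables) {
            if (table.startsWith(PREFIX) && !inUse.contains(table)) {
                unused.add(table);
            }
        }
        return unused;
    }

    // Builds DROP statements only when delete is true.
    // With delete == false it is a dry run: candidates are printed, nothing is returned.
    static List<String> planCleanup(List<String> allTables, Set<String> inUse, boolean delete) {
        List<String> statements = new ArrayList<>();
        for (String table : findUnused(allTables, inUse)) {
            if (delete) {
                statements.add("DROP TABLE IF EXISTS " + table + ";");
            } else {
                System.out.println("Dry run, would drop: " + table);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("kylin_intermediate_a", "kylin_intermediate_b", "sales_fact");
        Set<String> inUse = new HashSet<>(Arrays.asList("kylin_intermediate_b"));
        // "--delete false": only lists the candidate table.
        planCleanup(all, inUse, false);
        // "--delete true": produces the statement that would be sent to Hive.
        for (String stmt : planCleanup(all, inUse, true)) {
            System.out.println(stmt);
        }
    }
}
```

Keeping the delete check inside the loop, right before the drop, means the 
dry run and the real run walk through exactly the same list of candidates.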

> Finish the hive intermediate table clean up job in 
> org.apache.kylin.job.hadoop.cube.StorageCleanupJob
> -----------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-998
>                 URL: https://issues.apache.org/jira/browse/KYLIN-998
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Storage - HBase
>    Affects Versions: v0.7.2, v0.7.1
>            Reporter: nichunen
>            Assignee: Shaofeng SHI
>             Fix For: v1.1
>
>         Attachments: KYLIN-998-0.7-staging-v2.patch, 
> KYLIN-998-0.7-staging.patch, KYLIN-998-0.8-v2.patch, KYLIN-998-0.8.patch, 
> patch1.zip
>
>
> Currently, the last step of Kylin's cube building job, named “Garbage 
> Collection”, removes the intermediate data in HDFS/HBase/Hive. But if the job 
> is stopped unexpectedly (for example by a problem in the Hadoop cluster, a bad 
> cube design, or being discarded by the user), the data is left undeleted. 
> In such cases, we can run "hbase org.apache.hadoop.util.RunJar 
> $KYLIN_HOME/lib/kylin-job-0.8.1-incubating-SNAPSHOT.jar 
> org.apache.kylin.job.hadoop.cube.StorageCleanupJob --delete true" to remove 
> the data. But the method "cleanUnusedIntermediateHiveTable" is unfinished.
> My first patch finishes the method; it removes unused Hive tables 
> whose names begin with "kylin_intermediate_".
> My second patch adds some methods that enable deleting unused data by UUIDs 
> given on the command line or stored in a file.
> I don't know whether the second patch is useful to you; we use it on our 
> Kylin server to remove data after a cube is deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
