[ 
https://issues.apache.org/jira/browse/KYLIN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740039#comment-14740039
 ] 

Yerui Sun commented on KYLIN-978:
---------------------------------

In CubingJobBuilder.addCubingSteps, both intermediateHiveTableLocation and 
factDistinctColumnsPath are added to toDeletePaths. The former is the 
external location of the Hive intermediate table, which I think is the path 
you are talking about.
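
To illustrate the idea (this is only a sketch, not the code in the patch): 
paths are registered while the job steps are assembled, and the cleanup step 
later deletes each of them recursively. The class and method names below are 
hypothetical, and java.nio.file on a local directory stands in for HDFS; on 
HDFS the equivalent call would be Hadoop's FileSystem.delete(path, true).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for the cleanup step; not Kylin's actual class.
class GarbageCollectionSketch {
    // Paths registered for deletion while the job's steps are assembled.
    private final List<Path> toDeletePaths = new ArrayList<>();

    void addToDeletePath(Path path) {
        toDeletePaths.add(path);
    }

    // Recursively deletes every registered path that exists and
    // returns the roots that were actually removed.
    List<Path> cleanUp() throws IOException {
        List<Path> deleted = new ArrayList<>();
        for (Path root : toDeletePaths) {
            if (!Files.exists(root)) {
                continue; // nothing to clean for this entry
            }
            List<Path> entries;
            try (Stream<Path> walk = Files.walk(root)) {
                // Reverse order so children are removed before parents.
                entries = walk.sorted(Comparator.reverseOrder())
                              .collect(Collectors.toList());
            }
            for (Path entry : entries) {
                Files.delete(entry);
            }
            deleted.add(root);
        }
        return deleted;
    }
}
```

With this shape, dropping the external table and removing its data directory 
are separate actions, so the data directory has to be registered explicitly 
or it is left behind.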

I've tested this patch in our environment and it works well. After the job 
completes, only the cuboid files are left on HDFS.

Please check the code and correct me if I have misunderstood anything.

> GarbageCollectionStep dropped Hive Intermediate Table but didn't drop 
> external hdfs path
> ----------------------------------------------------------------------------------------
>
>                 Key: KYLIN-978
>                 URL: https://issues.apache.org/jira/browse/KYLIN-978
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v1.0, v0.7.2
>            Reporter: Yerui Sun
>            Assignee: Shaofeng SHI
>             Fix For: v1.1
>
>         Attachments: KYLIN-978-1.x-staging-v2.patch
>
>
> In GarbageCollectionStep, the Hive intermediate table created in step 1 is 
> dropped. 
> Because the table is an external table, its data is stored in an external 
> HDFS path, like '.../kylin-${jobId}/kylin_intermediate_...', which is not 
> deleted when the Hive table is dropped.
> Considering the purpose of GarbageCollectionStep, the external data path 
> should also be deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
