Hi team,

Any updates on this?

Thanks,
Ketan
> On 01-Feb-2019, at 11:39 AM, ketan dikshit <[email protected]> wrote:
>
> Hi Team,
>
> We have a lot of data accumulated in our HDFS working directory, so we want
> to understand what the following job data is used for once a job has
> completed and its segment has been successfully created:
>
> <hdfs-working-dir>/<metadata-name>/<job-id>/<cube-name>/cuboid
> <hdfs-working-dir>/<metadata-name>/<job-id>/<cube-name>/fact_distinct_columns
> <hdfs-working-dir>/<metadata-name>/<job-id>/<cube-name>/hfile
> <hdfs-working-dir>/<metadata-name>/<job-id>/<cube-name>/rowkey_stats
>
> Basically, I need to understand the purpose of cuboid, fact_distinct_columns,
> hfile, and rowkey_stats after the job has built the cube segment (assuming we
> do not use any merging/auto-merging of segments on the cube later).
>
> The space taken up by this data in the HDFS working directory is quite large
> (affecting our costs), and it is not being cleaned up by the cleanup job
> (org.apache.kylin.tool.StorageCleanupJob). So we need to understand whether
> we can safely delete this data manually without running into issues later.
>
> Thanks,
> Ketan@Exponential
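
For reference, a minimal sketch of how the space used by these job directories can be inspected and how the cleanup tool can be dry-run before deleting anything; the KYLIN_HOME location and the example path are illustrative assumptions, not our exact setup:

    # Summarise space used per job directory under the HDFS working dir (illustrative path)
    hdfs dfs -du -h <hdfs-working-dir>/<metadata-name>

    # Dry run: list what StorageCleanupJob would remove, without deleting anything
    ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false

    # Delete the unreferenced intermediate data (only after reviewing the dry-run output)
    ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true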
