Hi Team,

We have a lot of data accumulated in our hdfs-working-directory, so we want to understand how the following job data is used once a job has completed and its segment has been successfully created:
<hdfs-working-dir>/<metadata-name>/<job-id>/<cube-name>/cuboid
<hdfs-working-dir>/<metadata-name>/<job-id>/<cube-name>/fact_distinct_columns
<hdfs-working-dir>/<metadata-name>/<job-id>/<cube-name>/hfile
<hdfs-working-dir>/<metadata-name>/<job-id>/<cube-name>/rowkey_stats

Basically, we need to understand the purpose of cuboid, fact_distinct_columns, hfile, and rowkey_stats after the job has built the cube segment (assuming we do not use any merging/auto-merging of segments on the cube later).

The space taken up by this data in the hdfs-working-dir is quite large (it is affecting our costs), and it is not being cleaned up by the cleanup job (org.apache.kylin.tool.StorageCleanupJob). So we need to understand whether we can manually clean this up without running into issues later. The commands we are using are shown below.
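For reference, this is roughly how we are measuring the usage and invoking the cleanup job (a sketch from our setup; the placeholder paths mirror the ones above, and my understanding is that StorageCleanupJob only lists candidates unless --delete true is passed):

    # Check how much space each job's intermediate data takes in HDFS
    hdfs dfs -du -h <hdfs-working-dir>/<metadata-name>

    # Dry run: list what the cleanup job considers removable
    ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false

    # Actually delete the unused intermediate data
    ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true

Even after running the last command, the cuboid/fact_distinct_columns/hfile/rowkey_stats directories for completed jobs remain.

Thanks,
Ketan@Exponential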