[ https://issues.apache.org/jira/browse/KYLIN-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614204#comment-16614204 ]
Shaofeng SHI commented on KYLIN-3555:
-------------------------------------

The "kylin.storage.hbase.cluster-fs" property needs to be a file system URI, not an absolute path. In this case, since you use S3 for both, "kylin.storage.hbase.cluster-fs" can be left empty.

> Garbage collection on HBase step fails with S3 selected as storage
> ------------------------------------------------------------------
>
> Key: KYLIN-3555
> URL: https://issues.apache.org/jira/browse/KYLIN-3555
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.4.1
> Reporter: Iñigo Martinez
> Priority: Major
> Labels: build
> Attachments: Screenshot from 2018-09-11 12-31-25.png
>
>
> When building a cube with S3 selected as storage, the build process fails at the last step.
> Although S3 has been defined as storage, the cleanup task tries to delete from HDFS and, of course, there is no file at HDFS.
>
> {code:java}
> 2018-09-11 12:27:56,311 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: s3://XXXXXXX-emr-kylin
> 2018-09-11 12:27:57,364 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:87 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns is dropped.
> 2018-09-11 12:27:58,104 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:87 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/hfile is dropped.
> 2018-09-11 12:27:58,140 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: hdfs://ip-10-0-1-63.eu-west-1.compute.internal:8020
> 2018-09-11 12:27:58,142 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:90 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns not exists.
> 2018-09-11 12:27:58,147 ERROR [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:68 : job:f8416975-eea6-4500-9cb7-4374f28451dc-15 execute finished with exception
> java.io.FileNotFoundException: File /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1 does not exist.
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:114)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:964)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:971)
>     at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.dropHdfsPathOnCluster(HDFSPathGarbageCollectionStep.java:95)
>     at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.doWork(HDFSPathGarbageCollectionStep.java:65)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748){code}
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)