[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
[ https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938620#comment-16938620 ]

ASF subversion and git services commented on KYLIN-4153:

Commit 9349e4adf50e20b79e3675dd798009039041defb in kylin's branch refs/heads/2.6.x from XiaoxiangYu
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=9349e4a ]

KYLIN-4153 Delete marker if real file not exists

> Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
>
>                 Key: KYLIN-4153
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4153
>             Project: Kylin
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: v2.6.0
>            Reporter: Xiaoxiang Yu
>            Assignee: Xiaoxiang Yu
>            Priority: Major
>             Fix For: v3.0.0-beta, v2.6.4
>
> In *Kylin 2.6.0*, the Kylin team introduced an important refactor of Kylin's Metadata Store, which added many enhancements such as uploading/downloading metadata concurrently and storing metadata with JDBC. Please refer to https://issues.apache.org/jira/browse/KYLIN-3671 for details.
>
> When Kylin wants to save a *big resource* (such as a dict or snapshot) into the metadata store, it does not store it in the metadata store (HBase or RDBMS) directly. Instead, Kylin first {color:red}saves it into HDFS (Step 1){color}, and then {color:red}writes an empty byte array as a marker into the metadata store (Step 2){color}. If the first action succeeds and the second action fails, a rollback method is called to revert the modification to the HDFS files. We could regard this as a complete and atomic transaction.
>
> {color:#0747A6}Here is part of the source code added in KYLIN-3671.{color} Check it at https://github.com/apache/kylin/blob/8737bc1f555a2789a67462c8f8420b6ab3be97ce/core-common/src/main/java/org/apache/kylin/common/persistence/PushdownResourceStore.java#L58 .
>
> {code:java}
> final void putBigResource(String resPath, ContentWriter content, long newTS) throws IOException {
>     // pushdown the big resource to DFS file
>     RollbackablePushdown pushdown = writePushdown(resPath, content); // Step 1: write big resource into HDFS
>     try {
>         // write a marker in resource store, to indicate the resource is now available
>         logger.debug("Writing marker for big resource {}", resPath);
>         putResourceWithRetry(resPath, ContentWriter.create(BytesUtil.EMPTY_BYTE_ARRAY), newTS); // Step 2: write marker into HBase/RDBMS
>     } catch (Throwable ex) {
>         pushdown.rollback();
>         throw ex;
>     } finally {
>         pushdown.close();
>     }
> }
> {code}
>
> But in some cases, the writes in both Step 1 and Step 2 succeed, yet an exception is still thrown in Step 2. {color:red}The rollback won't clear the marker written in Step 2{color}, which breaks the atomicity of this put action and causes a FileNotFoundException when Kylin later tries to read that dict.
>
> {color:#0747A6}Here is part of the reporter's kylin.log showing the incomplete rollback action.{color}
>
> {noformat}
> 2019-08-29 05:13:51,237 INFO [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] dict.DictionaryManager:388 : Saving dictionary at /dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:51,238 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:98 : Writing pushdown file /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict.temp.-1798610090
> 2019-08-29 05:13:51,256 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:117 : Move /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict.temp.-1798610090 to /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:51,258 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:65 : Writing marker for big resource /dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:56,263 WARN [hconnection-0x56f3258e-shared--pool10944-t54867] client.AsyncProcess:1263 : #10545, table=kylin_metadata, attempt=1/1 failed=1ops, last exception: java.io.IOException: Call to tx-dn41.data/10.14.243.51:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=2662317, waitTime=5001, operationTimeout=5000 expired. on tx-dn41.data,60020,1565943919204, tracking started Thu Aug 29 05:13:51 GMT+08:00 2019; not retrying 1 - final failure
> 2019-08-29 05:13:56,266 ERROR [Sched
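
The failure mode described in the issue can be reproduced with a small in-memory model. This is only a sketch under stated assumptions, not Kylin code: `DanglingMarkerDemo`, its two maps, `putMarkerButTimeOut`, and the example resource path are hypothetical stand-ins for HDFS, the metadata store, and a put that is applied on the server while the client call times out.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the two-step put; not Kylin's real classes.
class DanglingMarkerDemo {
    final Map<String, byte[]> dfs = new HashMap<>();      // stands in for HDFS
    final Map<String, byte[]> metadata = new HashMap<>(); // stands in for HBase/RDBMS

    // Simulates a put that is applied on the server while the client call times out.
    void putMarkerButTimeOut(String resPath) throws IOException {
        metadata.put(resPath, new byte[0]);
        throw new IOException("simulated call timeout after server-side write");
    }

    void putBigResource(String resPath, byte[] content) throws IOException {
        dfs.put(resPath, content);        // Step 1: write the big resource
        try {
            putMarkerButTimeOut(resPath); // Step 2: write the marker
        } catch (IOException ex) {
            dfs.remove(resPath);          // rollback reverts only the DFS file
            throw ex;
        }
    }

    // Returns true when the marker exists but the real file does not.
    static boolean reproduce() {
        DanglingMarkerDemo store = new DanglingMarkerDemo();
        String resPath = "/dict/EXAMPLE_TABLE/EXAMPLE_COLUMN/example.dict";
        try {
            store.putBigResource(resPath, new byte[] {1, 2, 3});
        } catch (IOException expected) {
            // the caller sees a failed put
        }
        return store.metadata.containsKey(resPath) && !store.dfs.containsKey(resPath);
    }

    public static void main(String[] args) {
        System.out.println("dangling marker: " + reproduce());
    }
}
```

In this model the caller sees a failed put, the rollback removes the file, and the marker is left behind, which is the state that later produces the FileNotFoundException on read.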
[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
[ https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925702#comment-16925702 ]

ASF GitHub Bot commented on KYLIN-4153:

nichunen commented on pull request #818: KYLIN-4153 Delete marker if real file not exists
URL: https://github.com/apache/kylin/pull/818

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
[ https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925703#comment-16925703 ]

ASF subversion and git services commented on KYLIN-4153:

Commit 7e117e27764dc94cd627b0bd3dc4f4bbbf7f4a3e in kylin's branch refs/heads/master from XiaoxiangYu
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=7e117e2 ]

KYLIN-4153 Delete marker if real file not exists
[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
[ https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920561#comment-16920561 ]

Xiaoxiang Yu commented on KYLIN-4153:

Hi, [~Shaofengshi]

1. "although the step 2 throws an exception, the data was actually inserted successfully, is that true? "
Yes, it is true. The issue reporter used the scan API in the HBase shell to check that item, and the empty byte array does exist.

2. "When rollback, how can it ensure the entry be deleted as well? "
Indeed, I cannot guarantee that the entry will be deleted as expected. So I let Kylin delete that empty entry when it cannot find the pushdown file.

{code:java}
protected InputStream openPushdown(String resPath) throws IOException {
    try {
        Path p = pushdownPath(resPath);
        FileSystem fs = pushdownFS();
        if (fs.exists(p)) {
            return fs.open(p);
        } else {
            throw new FileNotFoundException(p.toString() + " (FS: " + fs + ")");
        }
    } catch (FileNotFoundException fileNotFound) {
        logger.error("Marker exists but real file not found, delete marker.");
        deleteResourceImpl(resPath); // Add here
        throw new IOException("Failed to read big resource " + resPath, fileNotFound);
    } catch (Exception ex) {
        throw new IOException("Failed to read big resource " + resPath, ex);
    }
}
{code}
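
The read-path cleanup proposed in this comment (delete the marker when the pushdown file cannot be found) can be illustrated with a minimal in-memory sketch. It is an assumption-laden model rather than the actual patch: `MarkerCleanupDemo`, its maps, and the example resource path are hypothetical, and `metadata.remove` merely stands in for the `deleteResourceImpl(resPath)` call shown above.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the read path; not Kylin's real classes.
class MarkerCleanupDemo {
    final Map<String, byte[]> dfs = new HashMap<>();      // stands in for the pushdown file system
    final Map<String, byte[]> metadata = new HashMap<>(); // stands in for the metadata store

    byte[] openPushdown(String resPath) throws IOException {
        try {
            byte[] content = dfs.get(resPath);
            if (content == null) {
                throw new FileNotFoundException(resPath);
            }
            return content;
        } catch (FileNotFoundException fileNotFound) {
            // Marker without a real file: remove the marker so later reads do not trust it.
            metadata.remove(resPath);
            throw new IOException("Failed to read big resource " + resPath, fileNotFound);
        }
    }

    // Returns true if the failed read removed the dangling marker.
    static boolean readCleansDanglingMarker() {
        MarkerCleanupDemo store = new MarkerCleanupDemo();
        String resPath = "/dict/EXAMPLE_TABLE/EXAMPLE_COLUMN/example.dict";
        store.metadata.put(resPath, new byte[0]); // dangling marker, no DFS file
        boolean failed = false;
        try {
            store.openPushdown(resPath);
        } catch (IOException expected) {
            failed = true;
        }
        return failed && !store.metadata.containsKey(resPath);
    }

    public static void main(String[] args) {
        System.out.println("marker cleaned up: " + readCleansDanglingMarker());
    }
}
```

Note that in this sketch the first read of a dangling marker still fails; the cleanup only ensures that the stale marker is gone afterwards, so the resource no longer appears to exist.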
[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
[ https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920546#comment-16920546 ]

ASF GitHub Bot commented on KYLIN-4153:

hit-lacus commented on pull request #818: KYLIN-4153 Delete marker if real file not exists
URL: https://github.com/apache/kylin/pull/818
[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
[ https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920410#comment-16920410 ]

Shaofeng SHI commented on KYLIN-4153:

Hi Xiaoxiang, from your observation, although the step 2 throws an exception, the data was actually inserted successfully, is that true? When rollback, how can it ensure the entry be deleted as well?