[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step

2019-09-26 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938620#comment-16938620
 ] 

ASF subversion and git services commented on KYLIN-4153:


Commit 9349e4adf50e20b79e3675dd798009039041defb in kylin's branch 
refs/heads/2.6.x from XiaoxiangYu
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=9349e4a ]

KYLIN-4153 Delete marker if real file not exists


> Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
>
> Key: KYLIN-4153
> URL: https://issues.apache.org/jira/browse/KYLIN-4153
> Project: Kylin
> Issue Type: Bug
> Components: Metadata
> Affects Versions: v2.6.0
> Reporter: Xiaoxiang Yu
> Assignee: Xiaoxiang Yu
> Priority: Major
> Fix For: v3.0.0-beta, v2.6.4
>
>
> In *Kylin 2.6.0*, the Kylin team introduced an important refactoring of
> Kylin's Metadata Store, which added many enhancements such as concurrent
> upload/download of metadata and storing metadata via JDBC. Please refer to
> https://issues.apache.org/jira/browse/KYLIN-3671 for details.
>
> When Kylin wants to save a *big resource* (such as a dict or snapshot) into
> the metadata store, it does not store it in the metadata store (HBase or
> RDBMS) directly. Instead, Kylin first {color:red}saves it to HDFS (Step
> 1){color}, and then {color:red}writes an empty byte array as a marker into
> the metadata store (Step 2){color}. If the first action succeeds and the
> second action fails, a rollback method is called to revert the modification
> of the HDFS files. The two steps are meant to behave as one complete, atomic
> transaction.
>
> {color:#0747A6}Here is part of the source code added in KYLIN-3671.{color}
> Check it at
> https://github.com/apache/kylin/blob/8737bc1f555a2789a67462c8f8420b6ab3be97ce/core-common/src/main/java/org/apache/kylin/common/persistence/PushdownResourceStore.java#L58
> {code:java}
> final void putBigResource(String resPath, ContentWriter content, long newTS) throws IOException {
>     // pushdown the big resource to DFS file
>     RollbackablePushdown pushdown = writePushdown(resPath, content); // Step 1: write big resource into HDFS
>     try {
>         // write a marker in resource store, to indicate the resource is now available
>         logger.debug("Writing marker for big resource {}", resPath);
>         putResourceWithRetry(resPath, ContentWriter.create(BytesUtil.EMPTY_BYTE_ARRAY), newTS); // Step 2: write marker into HBase/RDBMS
>     } catch (Throwable ex) {
>         pushdown.rollback();
>         throw ex;
>     } finally {
>         pushdown.close();
>     }
> }
> {code}
>  
>  
>  
> In some cases, however, both step 1 and step 2 succeed, yet an exception is
> still thrown from step 2. The rollback then reverts the HDFS file but
> {color:red}does not clear the marker written in Step 2{color}. This breaks
> the atomicity of the put action and later causes a FileNotFoundException
> when Kylin tries to read that dict.
>
> {color:#0747A6}Here is part of the reporter's kylin.log showing the
> incomplete rollback.{color}
>  
>   
> {noformat}
> 2019-08-29 05:13:51,237 INFO  [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] dict.DictionaryManager:388 : Saving dictionary at /dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:51,238 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:98 : Writing pushdown file /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict.temp.-1798610090
> 2019-08-29 05:13:51,256 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:117 : Move /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict.temp.-1798610090 to /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:51,258 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:65 : Writing marker for big resource /dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:56,263 WARN  [hconnection-0x56f3258e-shared--pool10944-t54867] client.AsyncProcess:1263 : #10545, table=kylin_metadata, attempt=1/1 failed=1ops, last exception: java.io.IOException: Call to tx-dn41.data/10.14.243.51:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=2662317, waitTime=5001, operationTimeout=5000 expired. on tx-dn41.data,60020,1565943919204, tracking started Thu Aug 29 05:13:51 GMT+08:00 2019; not retrying 1 - final failure
> 2019-08-29 05:13:56,266 ERROR [Sched
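
The failure mode in the quoted description can be reproduced in a small stand-alone simulation. The sketch below is not Kylin code: the class `DanglingMarkerDemo`, the in-memory `hdfsFiles` and `metadataStore` collections, and the helper `putMarkerWithLostAck` are hypothetical stand-ins for HDFS, the HBase/RDBMS metadata store, and a marker write that lands on the server while the client sees a timeout. It only assumes the two-step put and HDFS-only rollback described above.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DanglingMarkerDemo {
    // Hypothetical in-memory stand-ins for HDFS and the metadata store.
    static final Set<String> hdfsFiles = new HashSet<>();
    static final Map<String, byte[]> metadataStore = new HashMap<>();

    // Simulates a marker write that is persisted on the server side
    // although the client call ends with a timeout exception.
    static void putMarkerWithLostAck(String resPath) throws IOException {
        metadataStore.put(resPath, new byte[0]);
        throw new IOException("simulated call timeout");
    }

    static void putBigResource(String resPath) throws IOException {
        hdfsFiles.add(resPath);            // Step 1: write the big resource
        try {
            putMarkerWithLostAck(resPath); // Step 2: write the empty marker
        } catch (IOException ex) {
            hdfsFiles.remove(resPath);     // rollback reverts Step 1 only
            throw ex;
        }
    }

    public static void main(String[] args) {
        String resPath = "/dict/EXAMPLE.dict"; // made-up resource path
        try {
            putBigResource(resPath);
        } catch (IOException expected) {
            System.out.println("put failed: " + expected.getMessage());
        }
        // The marker survives the rollback while the file is gone.
        System.out.println("marker present: " + metadataStore.containsKey(resPath));
        System.out.println("file present: " + hdfsFiles.contains(resPath));
    }
}
```

After the failed put, the marker is still in the simulated metadata store but the file is not, which is the state that later leads to the FileNotFoundException on read.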

[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step

2019-09-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925702#comment-16925702
 ] 

ASF GitHub Bot commented on KYLIN-4153:
---

nichunen commented on pull request #818: KYLIN-4153 Delete marker if real file not exists
URL: https://github.com/apache/kylin/pull/818

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step

2019-09-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925703#comment-16925703
 ] 

ASF subversion and git services commented on KYLIN-4153:


Commit 7e117e27764dc94cd627b0bd3dc4f4bbbf7f4a3e in kylin's branch 
refs/heads/master from XiaoxiangYu
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=7e117e2 ]

KYLIN-4153 Delete marker if real file not exists



[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step

2019-09-01 Thread Xiaoxiang Yu (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920561#comment-16920561
 ] 

Xiaoxiang Yu commented on KYLIN-4153:
-

Hi, [~Shaofengshi]

1. "although the step 2 throws an exception, the data was actually inserted 
successfully, is that true? "
Yes, it is true. The issue reporter used the scan API in the HBase shell to 
check that item, and the empty byte array does exist.

2. "When rollback, how can it ensure the entry be deleted as well? "
I cannot guarantee that the entry will be deleted during rollback. So instead 
I let Kylin delete that empty entry when it cannot find the pushdown file.

{code:java}
protected InputStream openPushdown(String resPath) throws IOException {
    try {
        Path p = pushdownPath(resPath);
        FileSystem fs = pushdownFS();
        if (fs.exists(p)) {
            return fs.open(p);
        } else {
            throw new FileNotFoundException(p.toString() + "  (FS: " + fs + ")");
        }
    } catch (FileNotFoundException fileNotFound) {
        // The marker exists but the pushdown file is missing: remove the dangling marker.
        logger.error("Marker exists but real file not found, delete marker.");
        deleteResourceImpl(resPath); // Add here
        throw new IOException("Failed to read big resource " + resPath, fileNotFound);
    } catch (Exception ex) {
        throw new IOException("Failed to read big resource " + resPath, ex);
    }
}
{code}
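
The effect of deleting the marker on a failed read can be illustrated with a small stand-alone sketch. It is not Kylin code: the class `SelfHealingReadDemo` and the in-memory `files` and `markers` maps are hypothetical stand-ins for the pushdown file system and the metadata store, and it only assumes the behaviour shown in the snippet above (delete the marker, then rethrow).

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class SelfHealingReadDemo {
    // Hypothetical in-memory stand-ins for HDFS and the metadata store.
    static final Map<String, byte[]> files = new HashMap<>();
    static final Map<String, byte[]> markers = new HashMap<>();

    static byte[] openPushdown(String resPath) throws IOException {
        try {
            byte[] content = files.get(resPath);
            if (content == null) {
                throw new FileNotFoundException(resPath);
            }
            return content;
        } catch (FileNotFoundException fileNotFound) {
            // Marker without a real file: drop the marker so the resource
            // no longer appears to exist, then report the failure.
            markers.remove(resPath);
            throw new IOException("Failed to read big resource " + resPath, fileNotFound);
        }
    }

    public static void main(String[] args) {
        String resPath = "/dict/EXAMPLE.dict"; // made-up resource path
        markers.put(resPath, new byte[0]);     // dangling marker, no file
        try {
            openPushdown(resPath);
        } catch (IOException e) {
            System.out.println("read failed: " + e.getMessage());
        }
        System.out.println("marker still present: " + markers.containsKey(resPath));
    }
}
```

In this sketch the first read still fails, but it leaves the store consistent: the marker is gone, so a later attempt can treat the resource as absent rather than hitting the same missing file.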


[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step

2019-09-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920546#comment-16920546
 ] 

ASF GitHub Bot commented on KYLIN-4153:
---

hit-lacus commented on pull request #818: KYLIN-4153 Delete marker if real file not exists
URL: https://github.com/apache/kylin/pull/818




[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step

2019-09-01 Thread Shaofeng SHI (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920410#comment-16920410
 ] 

Shaofeng SHI commented on KYLIN-4153:
-

Hi xiaoxiang, from your observation, although the step 2 throws an exception, 
the data was actually inserted successfully, is that true? 

When rollback, how can it ensure the entry be deleted as well? 
