[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=302533&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302533
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 28/Aug/19 05:26
Start Date: 28/Aug/19 05:26
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on issue #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#issuecomment-525589669
 
 
   @bshashikant Thanks for working on the PR! I have merged it with trunk.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 302533)
Time Spent: 5.5h  (was: 5h 20m)

> Datanode unable to find chunk while replication data using ratis.
> -
>
> Key: HDDS-1753
> URL: https://issues.apache.org/jira/browse/HDDS-1753
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: HDDS-1753.000.patch
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Leader datanode is unable to read chunk from the datanode while replicating 
> data from leader to follower.
> Please note that deletion of keys is also happening while the data is being 
> replicated.
> {code}
> 2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 1)
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 2)
> 2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK {blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
> java.lang.Exception: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:346)

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=302532&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302532
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 28/Aug/19 05:24
Start Date: 28/Aug/19 05:24
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 302532)
Time Spent: 5h 20m  (was: 5h 10m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=302454&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302454
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 23:03
Start Date: 27/Aug/19 23:03
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#issuecomment-525516258
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 84 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 6 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 25 | Maven dependency ordering for branch |
   | +1 | mvninstall | 655 | trunk passed |
   | +1 | compile | 396 | trunk passed |
   | +1 | checkstyle | 81 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 1000 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 168 | trunk passed |
   | 0 | spotbugs | 440 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 646 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 30 | Maven dependency ordering for patch |
   | +1 | mvninstall | 581 | the patch passed |
   | +1 | compile | 390 | the patch passed |
   | +1 | javac | 390 | the patch passed |
   | +1 | checkstyle | 73 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 731 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 183 | the patch passed |
   | +1 | findbugs | 709 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 346 | hadoop-hdds in the patch passed. |
   | -1 | unit | 2332 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
   | | | 8629 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.ozone.om.TestOzoneManagerHA |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientForAclAuditLog |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.0 Server=19.03.0 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1318 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux ef6935296bb3 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 66cfa48 |
   | Default Java | 1.8.0_212 |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/6/artifact/out/patch-unit-hadoop-ozone.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/6/testReport/ |
   | Max. process+thread count | 5366 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-test U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/6/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 302454)
Time Spent: 5h 10m  (was: 5h)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=30&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-30
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 17:04
Start Date: 27/Aug/19 17:04
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on issue #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#issuecomment-525393837
 
 
   @bshashikant Thanks for updating the PR! The changes look good to me. +1.
 



Issue Time Tracking
---

Worklog Id: (was: 30)
Time Spent: 5h  (was: 4h 50m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=301943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301943
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 12:47
Start Date: 27/Aug/19 12:47
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r318059228
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/TestBlockDeletingService.java
 ##
 @@ -195,8 +198,13 @@ public void testBlockDeletion() throws Exception {
     ContainerSet containerSet = new ContainerSet();
     createToDeleteBlocks(containerSet, conf, 1, 3, 1);
 
+    OzoneContainer ozoneContainer = Mockito.mock(OzoneContainer.class);
+    Mockito.when(ozoneContainer.getContainerSet())
+        .thenReturn(containerSet);
+    Mockito.when(ozoneContainer.getWriteChannel())
+        .thenReturn(null);
 
 Review comment:
   Addressed in the latest patch.
 



Issue Time Tracking
---

Worklog Id: (was: 301943)
Time Spent: 4.5h  (was: 4h 20m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=301945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301945
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 12:47
Start Date: 27/Aug/19 12:47
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r318059374
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/statemachine/background/BlockDeletingService.java
 ##
 @@ -143,6 +150,52 @@ public BackgroundTaskQueue getTasks() {
     return queue;
   }
 
+  public List<ContainerData> chooseContainerForBlockDeletion(int count,
+      ContainerDeletionChoosingPolicy deletionPolicy)
+      throws StorageContainerException {
+    Map<Long, ContainerData> containerDataMap =
+        ozoneContainer.getContainerSet().getContainerMap().entrySet().stream()
+            .filter(e -> isDeletionAllowed(e.getValue().getContainerData(),
+                deletionPolicy)).collect(Collectors
+            .toMap(Map.Entry::getKey, e -> e.getValue().getContainerData()));
+    return deletionPolicy
+        .chooseContainerForBlockDeletion(count, containerDataMap);
+  }
+
+  private boolean isDeletionAllowed(ContainerData containerData,
+      ContainerDeletionChoosingPolicy deletionPolicy) {
+    if (!deletionPolicy
+        .isValidContainerType(containerData.getContainerType())) {
+      return false;
+    } else if (!containerData.isClosed()) {
+      return false;
+    } else {
+      if (ozoneContainer.getWriteChannel() instanceof XceiverServerRatis) {
+        try {
+          XceiverServerRatis ratisServer =
+              (XceiverServerRatis) ozoneContainer.getWriteChannel();
+          long minReplicatedIndex = ratisServer.getMinReplicatedIndex(PipelineID
+              .valueOf(UUID.fromString(containerData.getOriginPipelineId())));
 
 Review comment:
   Addressed in the latest patch.
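
The diff excerpt above stops right after `minReplicatedIndex` is fetched, so the comparison that follows it is not visible here. As a rough illustration only, the idea of holding back block deletion until the slowest follower has caught up could be sketched as below. Everything in this sketch is a hypothetical stand-in: `DeletionGateSketch`, `ContainerState`, `lastCommittedIndex` and the `>=` rule are assumptions made for the example, not the classes or the condition used in the patch.

```java
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical, simplified stand-ins; these are not the Ozone classes from the patch. */
final class DeletionGateSketch {

  /** Minimal container state needed for the eligibility check. */
  static final class ContainerState {
    final boolean closed;
    // Assumed: Raft log index of the last committed write for this container.
    final long lastCommittedIndex;

    ContainerState(boolean closed, long lastCommittedIndex) {
      this.closed = closed;
      this.lastCommittedIndex = lastCommittedIndex;
    }
  }

  private DeletionGateSketch() { }

  /**
   * Assumed rule: an open container is never eligible, and a closed one is
   * eligible only once every follower has replicated at least up to the
   * container's last committed index. Until then the leader may still need
   * to read the chunk files to serve appendEntries.
   */
  static boolean isDeletionAllowed(ContainerState c, long minReplicatedIndex) {
    if (!c.closed) {
      return false;
    }
    return minReplicatedIndex >= c.lastCommittedIndex;
  }

  /** Filters a container map down to the entries that pass the check above. */
  static Map<Long, ContainerState> chooseEligible(
      Map<Long, ContainerState> containers, long minReplicatedIndex) {
    return containers.entrySet().stream()
        .filter(e -> isDeletionAllowed(e.getValue(), minReplicatedIndex))
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
  }
}
```

With a slowest-follower index of 150, a closed container whose last commit is at index 100 would pass this check, while a closed container at index 200 or any open container would be skipped until a later run.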
 



Issue Time Tracking
---

Worklog Id: (was: 301945)
Time Spent: 4h 50m  (was: 4h 40m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=301944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301944
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 12:47
Start Date: 27/Aug/19 12:47
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r318059242
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java
 ##
 @@ -0,0 +1,288 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package org.apache.hadoop.ozone.client.rpc;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ReplicationFactor;
+import org.apache.hadoop.hdds.client.ReplicationType;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.scm.ScmConfigKeys;
+import org.apache.hadoop.hdds.scm.XceiverClientManager;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
+import org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.ozone.HddsDatanodeService;
+import org.apache.hadoop.ozone.MiniOzoneCluster;
+import org.apache.hadoop.ozone.OzoneConfigKeys;
+import org.apache.hadoop.ozone.client.ObjectStore;
+import org.apache.hadoop.ozone.client.OzoneClient;
+import org.apache.hadoop.ozone.client.OzoneClientFactory;
+import org.apache.hadoop.ozone.client.io.KeyOutputStream;
+import org.apache.hadoop.ozone.client.io.OzoneOutputStream;
+import org.apache.hadoop.ozone.container.ContainerTestHelper;
+import org.apache.hadoop.ozone.container.common.helpers.BlockData;
+import org.apache.hadoop.ozone.container.common.helpers.ChunkInfo;
+import org.apache.hadoop.ozone.container.common.interfaces.Container;
+import org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine;
+import org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine;
+import org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerData;
+import org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.apache.hadoop.ozone.om.helpers.OmKeyArgs;
+import org.apache.hadoop.ozone.om.helpers.OmKeyInfo;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_COMMAND_STATUS_REPORT_INTERVAL;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_REPORT_INTERVAL;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.HDDS_SCM_WATCHER_TIMEOUT;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_PIPELINE_DESTROY_TIMEOUT;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_STALENODE_INTERVAL;
+
+/**
+ * Tests delete key operation with a slow follower in the datanode
+ * pipeline.
+ */
+public class TestDeleteWithSlowFollower {
+
+  private static MiniOzoneCluster cluster;
+  private static OzoneConfiguration conf;
+  private static OzoneClient client;
+  private static ObjectStore objectStore;
+  private static String volumeName;
+  private static String bucketName;
+  private static String path;
+  private static XceiverClientManager xceiverClientManager;
+
+  /**
+   * Create a MiniDFSCluster for testing.
+   *
+   * @throws IOException
+   */
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new OzoneConfiguration();
+    path = Gener

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=301835&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301835
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 09:46
Start Date: 27/Aug/19 09:46
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r317987673
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java
 ##
 @@ -0,0 +1,288 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package org.apache.hadoop.ozone.client.rpc;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ReplicationFactor;
+import org.apache.hadoop.hdds.client.ReplicationType;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.scm.ScmConfigKeys;
+import org.apache.hadoop.hdds.scm.XceiverClientManager;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
+import org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.ozone.HddsDatanodeService;
+import org.apache.hadoop.ozone.MiniOzoneCluster;
+import org.apache.hadoop.ozone.OzoneConfigKeys;
+import org.apache.hadoop.ozone.client.ObjectStore;
+import org.apache.hadoop.ozone.client.OzoneClient;
+import org.apache.hadoop.ozone.client.OzoneClientFactory;
+import org.apache.hadoop.ozone.client.io.KeyOutputStream;
+import org.apache.hadoop.ozone.client.io.OzoneOutputStream;
+import org.apache.hadoop.ozone.container.ContainerTestHelper;
+import org.apache.hadoop.ozone.container.common.helpers.BlockData;
+import org.apache.hadoop.ozone.container.common.helpers.ChunkInfo;
+import org.apache.hadoop.ozone.container.common.interfaces.Container;
+import org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine;
+import org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine;
+import org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerData;
+import org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.apache.hadoop.ozone.om.helpers.OmKeyArgs;
+import org.apache.hadoop.ozone.om.helpers.OmKeyInfo;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_COMMAND_STATUS_REPORT_INTERVAL;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_REPORT_INTERVAL;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.HDDS_SCM_WATCHER_TIMEOUT;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_PIPELINE_DESTROY_TIMEOUT;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_STALENODE_INTERVAL;
+
+/**
+ * Tests delete key operation with a slow follower in the datanode
+ * pipeline.
+ */
+public class TestDeleteWithSlowFollower {
+
+  private static MiniOzoneCluster cluster;
+  private static OzoneConfiguration conf;
+  private static OzoneClient client;
+  private static ObjectStore objectStore;
+  private static String volumeName;
+  private static String bucketName;
+  private static String path;
+  private static XceiverClientManager xceiverClientManager;
+
+  /**
+   * Create a MiniDFSCluster for testing.
+   *
+   * @throws IOException
+   */
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new OzoneConfiguration();
+    path = Gener

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=301836&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301836
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 09:46
Start Date: 27/Aug/19 09:46
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r317980266
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/statemachine/background/BlockDeletingService.java
 ##
 @@ -143,6 +150,52 @@ public BackgroundTaskQueue getTasks() {
     return queue;
   }
 
+  public List<ContainerData> chooseContainerForBlockDeletion(int count,
+      ContainerDeletionChoosingPolicy deletionPolicy)
+      throws StorageContainerException {
+    Map<Long, ContainerData> containerDataMap =
+        ozoneContainer.getContainerSet().getContainerMap().entrySet().stream()
+            .filter(e -> isDeletionAllowed(e.getValue().getContainerData(),
+                deletionPolicy)).collect(Collectors
+            .toMap(Map.Entry::getKey, e -> e.getValue().getContainerData()));
+    return deletionPolicy
+        .chooseContainerForBlockDeletion(count, containerDataMap);
+  }
+
+  private boolean isDeletionAllowed(ContainerData containerData,
+      ContainerDeletionChoosingPolicy deletionPolicy) {
+    if (!deletionPolicy
+        .isValidContainerType(containerData.getContainerType())) {
+      return false;
+    } else if (!containerData.isClosed()) {
+      return false;
+    } else {
+      if (ozoneContainer.getWriteChannel() instanceof XceiverServerRatis) {
+        try {
+          XceiverServerRatis ratisServer =
+              (XceiverServerRatis) ozoneContainer.getWriteChannel();
+          long minReplicatedIndex =
+              ratisServer.getMinReplicatedIndex(PipelineID
+                  .valueOf(UUID.fromString(containerData.getOriginPipelineId())));
 
 Review comment:
   If the pipeline does not exist, this call should throw GroupMismatchException. 
We should verify once which exception is actually thrown. In that case 
isDeletionAllowed needs to return true.
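
   The suggestion above could be sketched roughly as follows. This is a minimal, 
self-contained illustration and not the code in the PR: `GroupMismatchException` 
and `ReplicationIndexSource` here are hypothetical stand-ins for the Ratis 
exception and for `XceiverServerRatis`, and the comparison against a container 
sequence id is an assumption, since the quoted hunk is cut off before that point.

```java
import java.util.UUID;

final class DeletionGuardSketch {

  /** Hypothetical stand-in for the Ratis exception named in the review comment. */
  public static class GroupMismatchException extends Exception {
    public GroupMismatchException(String message) {
      super(message);
    }
  }

  /** Hypothetical, minimal view of the Ratis server used by this sketch. */
  public interface ReplicationIndexSource {
    long getMinReplicatedIndex(UUID pipelineId) throws GroupMismatchException;
  }

  /**
   * Returns true when block deletion may proceed for a closed container.
   * Assumed rule: deletion waits until the slowest replica has reached the
   * container's sequence id; a missing pipeline means nothing is left to wait for.
   */
  public static boolean isDeletionAllowed(ReplicationIndexSource server,
      UUID pipelineId, long containerSequenceId) {
    try {
      long minReplicatedIndex = server.getMinReplicatedIndex(pipelineId);
      return minReplicatedIndex >= containerSequenceId;
    } catch (GroupMismatchException e) {
      // The pipeline no longer exists on this datanode, so there is no
      // follower left to catch up; allow deletion as the review suggests.
      return true;
    }
  }

  public static void main(String[] args) {
    UUID pipeline = UUID.randomUUID();
    ReplicationIndexSource lagging = id -> 5L;
    ReplicationIndexSource caughtUp = id -> 20L;
    ReplicationIndexSource gone = id -> {
      throw new GroupMismatchException("pipeline " + id + " not found");
    };
    System.out.println("lagging follower: "
        + isDeletionAllowed(lagging, pipeline, 10L));
    System.out.println("caught up: "
        + isDeletionAllowed(caughtUp, pipeline, 10L));
    System.out.println("pipeline gone: "
        + isDeletionAllowed(gone, pipeline, 10L));
  }
}
```

   Catching only the specific exception keeps other failures visible, which is 
why it matters to confirm which exception the real call throws.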
 



Issue Time Tracking
---

Worklog Id: (was: 301836)
Time Spent: 4h 20m  (was: 4h 10m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=301834&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301834
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 09:46
Start Date: 27/Aug/19 09:46
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r317981032
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/TestBlockDeletingService.java
 ##
 @@ -195,8 +198,13 @@ public void testBlockDeletion() throws Exception {
     ContainerSet containerSet = new ContainerSet();
     createToDeleteBlocks(containerSet, conf, 1, 3, 1);
 
+    OzoneContainer ozoneContainer = Mockito.mock(OzoneContainer.class);
+    Mockito.when(ozoneContainer.getContainerSet())
+        .thenReturn(containerSet);
+    Mockito.when(ozoneContainer.getWriteChannel())
+        .thenReturn(null);
 
 Review comment:
   Since it is used in all the test classes, we can add the logic for mocking 
ozoneContainer in a TestUtil class.
 



Issue Time Tracking
---

Worklog Id: (was: 301834)
Time Spent: 4h 10m  (was: 4h)

> Datanode unable to find chunk while replication data using ratis.
> -
>
> Key: HDDS-1753
> URL: https://issues.apache.org/jira/browse/HDDS-1753
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: HDDS-1753.000.patch
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Leader datanode is unable to read chunk from the datanode while replicating 
> data from leader to follower.
> Please note that deletion of keys is also happening while the data is being 
> replicated.
> {code}
> 2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 1)
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 2)
> 2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK {blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
> java.l

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=301807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301807
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 27/Aug/19 09:15
Start Date: 27/Aug/19 09:15
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#issuecomment-525215713
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 33 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 5 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for branch |
   | +1 | mvninstall | 617 | trunk passed |
   | +1 | compile | 384 | trunk passed |
   | +1 | checkstyle | 78 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 981 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 187 | trunk passed |
   | 0 | spotbugs | 472 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 702 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 32 | Maven dependency ordering for patch |
   | +1 | mvninstall | 582 | the patch passed |
   | +1 | compile | 381 | the patch passed |
   | +1 | javac | 381 | the patch passed |
   | -0 | checkstyle | 40 | hadoop-hdds: The patch generated 8 new + 0 unchanged - 0 fixed = 8 total (was 0) |
   | -0 | checkstyle | 45 | hadoop-ozone: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 757 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 173 | the patch passed |
   | +1 | findbugs | 743 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 363 | hadoop-hdds in the patch passed. |
   | -1 | unit | 2090 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 42 | The patch does not generate ASF License warnings. |
   | | | 8410 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1318 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 62af9001d773 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3329257 |
   | Default Java | 1.8.0_222 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/5/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/5/artifact/out/diff-checkstyle-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/5/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/5/testReport/ |
   | Max. process+thread count | 4504 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-test U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/5/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 301807)
Time Spent: 4h  (was: 3h 50m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300688&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300688
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 24/Aug/19 09:54
Start Date: 24/Aug/19 09:54
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#issuecomment-524537693
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 46 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 5 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 70 | Maven dependency ordering for branch |
   | +1 | mvninstall | 596 | trunk passed |
   | +1 | compile | 347 | trunk passed |
   | +1 | checkstyle | 62 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 802 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 149 | trunk passed |
   | 0 | spotbugs | 432 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 633 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 27 | Maven dependency ordering for patch |
   | +1 | mvninstall | 534 | the patch passed |
   | +1 | compile | 361 | the patch passed |
   | +1 | javac | 361 | the patch passed |
   | -0 | checkstyle | 31 | hadoop-hdds: The patch generated 8 new + 0 unchanged - 0 fixed = 8 total (was 0) |
   | -0 | checkstyle | 38 | hadoop-ozone: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 638 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 155 | the patch passed |
   | +1 | findbugs | 663 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 290 | hadoop-hdds in the patch passed. |
   | -1 | unit | 2136 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 44 | The patch does not generate ASF License warnings. |
   | | | 7778 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion |
   |   | hadoop.ozone.client.rpc.TestReadRetries |
   |   | hadoop.ozone.container.server.TestSecureContainerServer |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1318 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux a32307802f67 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / d2225c8 |
   | Default Java | 1.8.0_222 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/4/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/4/artifact/out/diff-checkstyle-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/4/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/4/testReport/ |
   | Max. process+thread count | 4503 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-test U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/4/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 300688)
Time Spent: 3h 50m  (was: 3h 40m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300167&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300167
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 08:57
Start Date: 23/Aug/19 08:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#issuecomment-524234061
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 45 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 5 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 21 | Maven dependency ordering for branch |
   | +1 | mvninstall | 588 | trunk passed |
   | +1 | compile | 359 | trunk passed |
   | +1 | checkstyle | 67 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 850 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 150 | trunk passed |
   | 0 | spotbugs | 419 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 606 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 30 | Maven dependency ordering for patch |
   | -1 | mvninstall | 292 | hadoop-ozone in the patch failed. |
   | -1 | compile | 231 | hadoop-ozone in the patch failed. |
   | -1 | javac | 231 | hadoop-ozone in the patch failed. |
   | -0 | checkstyle | 33 | hadoop-hdds: The patch generated 8 new + 0 unchanged - 0 fixed = 8 total (was 0) |
   | -0 | checkstyle | 35 | hadoop-ozone: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 652 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 152 | the patch passed |
   | -1 | findbugs | 356 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | +1 | unit | 290 | hadoop-hdds in the patch passed. |
   | -1 | unit | 324 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 38 | The patch does not generate ASF License warnings. |
   | | | 5676 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1318 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 0afc6ecbfc32 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / bd7baea |
   | Default Java | 1.8.0_212 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/artifact/out/patch-mvninstall-hadoop-ozone.txt |
   | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/artifact/out/patch-compile-hadoop-ozone.txt |
   | javac | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/artifact/out/patch-compile-hadoop-ozone.txt |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/artifact/out/diff-checkstyle-hadoop-ozone.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/artifact/out/patch-findbugs-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/testReport/ |
   | Max. process+thread count | 1345 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-test U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/3/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300108&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300108
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 07:21
Start Date: 23/Aug/19 07:21
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 300108)
Time Spent: 3.5h  (was: 3h 20m)

> Datanode unable to find chunk while replication data using ratis.
> -
>
> Key: HDDS-1753
> URL: https://issues.apache.org/jira/browse/HDDS-1753
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: HDDS-1753.000.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Leader datanode is unable to read chunk from the datanode while replicating 
> data from leader to follower.
> Please note that deletion of keys is also happening while the data is being 
> replicated.
> {code}
> 2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 1)
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 2)
> 2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK {blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
> java.lang.Exception: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
>   at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:346) ~[hadoop-hdds-container-service-0.5.0-SNAPSH

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300106&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300106
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 07:13
Start Date: 23/Aug/19 07:13
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 300106)
Time Spent: 3h 20m  (was: 3h 10m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300100
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:52
Start Date: 23/Aug/19 06:52
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316995331
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -24,20 +24,21 @@
 import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
 
 Review comment:
   addressed
 



Issue Time Tracking
---

Worklog Id: (was: 300100)
Time Spent: 3h 10m  (was: 3h)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300098
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:52
Start Date: 23/Aug/19 06:52
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316995218
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -55,6 +56,13 @@
   ConcurrentSkipListMap<>();
   private final ConcurrentSkipListSet missingContainerSet =
   new ConcurrentSkipListSet<>();
+
+  private XceiverServerSpi writeChannel = null;
 
 Review comment:
   addressed
 



Issue Time Tracking
---

Worklog Id: (was: 300098)
Time Spent: 2h 50m  (was: 2h 40m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300099
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:52
Start Date: 23/Aug/19 06:52
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316995273
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -24,20 +24,21 @@
 import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
 import org.apache.hadoop.hdds.scm.container.common.helpers
 .StorageContainerException;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineID;
 import org.apache.hadoop.ozone.container.common.interfaces.Container;
 import org.apache.hadoop.ozone.container.common
 .interfaces.ContainerDeletionChoosingPolicy;
+import org.apache.hadoop.ozone.container.common.transport.server.XceiverServerSpi;
+import org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis;
 import org.apache.hadoop.ozone.container.common.volume.HddsVolume;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 import java.io.IOException;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Set;
-import java.util.Map;
+import java.util.*;
 
 Review comment:
   addressed
 



Issue Time Tracking
---

Worklog Id: (was: 300099)
Time Spent: 3h  (was: 2h 50m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300097
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:52
Start Date: 23/Aug/19 06:52
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316995081
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java
 ##
 @@ -50,14 +46,7 @@
 import org.apache.ratis.grpc.GrpcFactory;
 import org.apache.ratis.grpc.GrpcTlsConfig;
 import org.apache.ratis.netty.NettyConfigKeys;
-import org.apache.ratis.protocol.RaftClientRequest;
-import org.apache.ratis.protocol.Message;
-import org.apache.ratis.protocol.RaftClientReply;
-import org.apache.ratis.protocol.ClientId;
-import org.apache.ratis.protocol.NotLeaderException;
-import org.apache.ratis.protocol.StateMachineException;
-import org.apache.ratis.protocol.RaftPeerId;
-import org.apache.ratis.protocol.RaftGroupId;
+import org.apache.ratis.protocol.*;
 
 Review comment:
   addressed
 



Issue Time Tracking
---

Worklog Id: (was: 300097)
Time Spent: 2h 40m  (was: 2.5h)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300096
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:51
Start Date: 23/Aug/19 06:51
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316995056
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java
 ##
 @@ -654,4 +646,18 @@ public void handleNodeLogFailure(RaftGroupId groupId, Throwable t) {
 triggerPipelineClose(groupId, msg,
 ClosePipelineInfo.Reason.PIPELINE_LOG_FAILED, true);
   }
+
+  public long getMinReplicatedIndex(PipelineID pipelineID) throws IOException {
+Long minIndex = null;
+Iterator raftGroupIterator = getServer().getGroups().iterator();
+while (raftGroupIterator.hasNext()) {
 
 Review comment:
   addressed.
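
For readers following the hunk above, a minimal sketch of the idea the method name suggests: take the smallest index that any replica reports. This is not the code from the pull request. The class name, the plain `Collection<Long>` input, and the use of -1 for "no index known" are assumptions made for illustration, and no Ratis API is called.

```java
import java.util.Collection;

/** Hypothetical helper for illustration; not part of the patch. */
final class MinReplicatedIndexSketch {
  private MinReplicatedIndexSketch() { }

  /**
   * Returns the smallest index among the given replica indexes.
   * Indexes are assumed to be non-negative; -1 is returned when the
   * collection is empty, meaning no index is known.
   */
  static long minReplicatedIndex(Collection<Long> replicaIndexes) {
    long min = -1;
    for (long index : replicaIndexes) {
      // The first value seen replaces the -1 placeholder.
      if (min == -1 || index < min) {
        min = index;
      }
    }
    return min;
  }
}
```

A caller would pass one index per replica of the pipeline and treat -1 as "cannot decide yet".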
 



Issue Time Tracking
---

Worklog Id: (was: 300096)
Time Spent: 2.5h  (was: 2h 20m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300095
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:51
Start Date: 23/Aug/19 06:51
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316995006
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -236,14 +244,43 @@ public ContainerReportsProto getContainerReport() throws IOException {
   ContainerDeletionChoosingPolicy deletionPolicy)
   throws StorageContainerException {
 Map containerDataMap = containerMap.entrySet().stream()
-.filter(e -> deletionPolicy.isValidContainerType(
-e.getValue().getContainerType()))
-.collect(Collectors.toMap(Map.Entry::getKey,
-e -> e.getValue().getContainerData()));
+.filter(e ->
+deletionPolicy.isValidContainerType(e.getValue().getContainerType())
+&& isDeletionAllowed(e.getValue().getContainerData())).collect(
+Collectors.toMap(Map.Entry::getKey,
+e -> e.getValue().getContainerData()));
 return deletionPolicy
 .chooseContainerForBlockDeletion(count, containerDataMap);
   }
 
+  private boolean isDeletionAllowed(ContainerData containerData) {
+if (containerData.isClosed()) {
+  if (writeChannel instanceof XceiverServerRatis) {
+try {
+  XceiverServerRatis ratisServer = (XceiverServerRatis) writeChannel;
+  long minReplicatedIndex = ratisServer.getMinReplicatedIndex(PipelineID
+  .valueOf(UUID.fromString(containerData.getOriginPipelineId())));
+  long containerBCSID = containerData.getBlockCommitSequenceId();
+  if (minReplicatedIndex != 0 && minReplicatedIndex < containerBCSID) {
 
 Review comment:
   Will update the code to return -1 in case the getMinReplicatedIndex call fails.
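
The change described in this comment can be sketched roughly as follows. This is a hypothetical illustration, not the code from the pull request: the class `MinReplicatedIndexSketch`, the constant `UNKNOWN_INDEX` and the method `deletionAllowed` are invented names, and treating an unknown index as "nothing left to wait for" is an assumption of this sketch, since the thread does not say how the patch handles that case. The point is only that a -1 failure value leaves 0 free to be a legitimate replicated index.

```java
/** Hypothetical sketch of the index check; all names are illustrative only. */
final class MinReplicatedIndexSketch {

  /** Assumed sentinel meaning the min replicated index lookup failed. */
  static final long UNKNOWN_INDEX = -1L;

  private MinReplicatedIndexSketch() {
  }

  /**
   * Decides whether blocks of a closed container may be deleted.
   * Assumption of this sketch: an unknown index does not block deletion.
   */
  static boolean deletionAllowed(long minReplicatedIndex, long containerBcsId) {
    if (minReplicatedIndex == UNKNOWN_INDEX) {
      return true;
    }
    // Deletion waits until every server has replicated up to the
    // container's block commit sequence id.
    return minReplicatedIndex >= containerBcsId;
  }
}
```

With this shape, `deletionAllowed(0, 10)` returns false, whereas the `minReplicatedIndex != 0` guard in the quoted diff would skip the comparison when the index is 0.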
 



Issue Time Tracking
---

Worklog Id: (was: 300095)
Time Spent: 2h 20m  (was: 2h 10m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300093
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:50
Start Date: 23/Aug/19 06:50
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316994420
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java
 ##
 @@ -0,0 +1,274 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package org.apache.hadoop.ozone.client.rpc;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ReplicationFactor;
+import org.apache.hadoop.hdds.client.ReplicationType;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.scm.ScmConfigKeys;
+import org.apache.hadoop.hdds.scm.XceiverClientManager;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
+import org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.ozone.HddsDatanodeService;
+import org.apache.hadoop.ozone.MiniOzoneCluster;
+import org.apache.hadoop.ozone.OzoneConfigKeys;
+import org.apache.hadoop.ozone.client.ObjectStore;
+import org.apache.hadoop.ozone.client.OzoneClient;
+import org.apache.hadoop.ozone.client.OzoneClientFactory;
+import org.apache.hadoop.ozone.client.io.KeyOutputStream;
+import org.apache.hadoop.ozone.client.io.OzoneOutputStream;
+import org.apache.hadoop.ozone.container.ContainerTestHelper;
+import org.apache.hadoop.ozone.container.common.helpers.BlockData;
+import org.apache.hadoop.ozone.container.common.helpers.ChunkInfo;
+import org.apache.hadoop.ozone.container.common.interfaces.Container;
+import org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine;
+import org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine;
+import org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerData;
+import org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.apache.hadoop.ozone.om.helpers.OmKeyArgs;
+import org.apache.hadoop.ozone.om.helpers.OmKeyInfo;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_COMMAND_STATUS_REPORT_INTERVAL;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_REPORT_INTERVAL;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.HDDS_SCM_WATCHER_TIMEOUT;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_PIPELINE_DESTROY_TIMEOUT;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_STALENODE_INTERVAL;
+
+/**
+ * Tests delete key operation with a slow follower in the datanode
+ * pipeline.
+ */
+public class TestDeleteWithSlowFollower {
+
+  private static MiniOzoneCluster cluster;
+  private static OzoneConfiguration conf;
+  private static OzoneClient client;
+  private static ObjectStore objectStore;
+  private static String volumeName;
+  private static String bucketName;
+  private static String path;
+  private static XceiverClientManager xceiverClientManager;
+
+  /**
+   * Create a MiniDFSCluster for testing.
+   *
+   * @throws IOException
+   */
+  @BeforeClass
+  public static void init() throws Exception {
+conf = new OzoneConfiguration();
+path = Gener

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300094
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:50
Start Date: 23/Aug/19 06:50
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316994794
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java
 ##
 @@ -0,0 +1,274 @@

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=300091&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300091
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:49
Start Date: 23/Aug/19 06:49
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316994420
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java
 ##
 @@ -0,0 +1,274 @@

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=299411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299411
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 22/Aug/19 12:49
Start Date: 22/Aug/19 12:49
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#issuecomment-523891282
 
 
   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 43 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 64 | Maven dependency ordering for branch |
   | +1 | mvninstall | 636 | trunk passed |
   | +1 | compile | 393 | trunk passed |
   | +1 | checkstyle | 67 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 848 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 161 | trunk passed |
   | 0 | spotbugs | 476 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 697 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 33 | Maven dependency ordering for patch |
   | +1 | mvninstall | 595 | the patch passed |
   | +1 | compile | 389 | the patch passed |
   | +1 | javac | 389 | the patch passed |
   | -0 | checkstyle | 34 | hadoop-hdds: The patch generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
   | -0 | checkstyle | 39 | hadoop-ozone: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 648 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 164 | the patch passed |
   | +1 | findbugs | 706 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 289 | hadoop-hdds in the patch passed. |
   | -1 | unit | 2390 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 35 | The patch does not generate ASF License warnings. |
   | | | 8372 | |

   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.container.common.statemachine.commandhandler.TestCloseContainerByPipeline |
   |   | hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion |
   |   | hadoop.ozone.client.rpc.TestContainerStateMachineFailures |
   |   | hadoop.ozone.client.rpc.TestWatchForCommit |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.ozone.container.server.TestSecureContainerServer |
   |   | hadoop.hdds.scm.pipeline.TestRatisPipelineProvider |
   |   | hadoop.ozone.client.rpc.Test2WayCommitInRatis |
   |   | hadoop.ozone.client.rpc.TestReadRetries |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStream |
   |   | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientForAclAuditLog |
   |   | hadoop.ozone.client.rpc.TestBCSID |
   |   | hadoop.ozone.client.rpc.TestContainerStateMachine |
   |   | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
   |   | hadoop.ozone.client.rpc.TestHybridPipelineOnDatanode |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1318 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 125b9977f71d 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / ee7c261 |
   | Default Java | 1.8.0_222 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/2/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/2/artifact/out/diff-checkstyle-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/2/artifact/out/patch-unit-hadoop-ozone.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/2/testReport/ |
   | Max. process+thread count | 5406 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-tes

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298735
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316221207
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java
 ##
 @@ -0,0 +1,274 @@

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298729&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298729
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316030192
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -236,14 +244,43 @@ public ContainerReportsProto getContainerReport() throws IOException {
   ContainerDeletionChoosingPolicy deletionPolicy)
   throws StorageContainerException {
 Map containerDataMap = containerMap.entrySet().stream()
-.filter(e -> deletionPolicy.isValidContainerType(
-e.getValue().getContainerType()))
-.collect(Collectors.toMap(Map.Entry::getKey,
-e -> e.getValue().getContainerData()));
+.filter(e ->
+deletionPolicy.isValidContainerType(e.getValue().getContainerType())
+&& isDeletionAllowed(e.getValue().getContainerData())).collect(
+Collectors.toMap(Map.Entry::getKey,
+e -> e.getValue().getContainerData()));
 return deletionPolicy
 .chooseContainerForBlockDeletion(count, containerDataMap);
   }
 
+  private boolean isDeletionAllowed(ContainerData containerData) {
+if (containerData.isClosed()) {
+  if (writeChannel instanceof XceiverServerRatis) {
+try {
+  XceiverServerRatis ratisServer = (XceiverServerRatis) writeChannel;
+  long minReplicatedIndex = ratisServer.getMinReplicatedIndex(PipelineID
+  .valueOf(UUID.fromString(containerData.getOriginPipelineId())));
+  long containerBCSID = containerData.getBlockCommitSequenceId();
+  if (minReplicatedIndex != 0 && minReplicatedIndex < containerBCSID) {
+LOG.info("Close Container lo Index {} is not replicated across all"
++ "the servers in the pipeline {} as the min replicated "
++ "index is {}. Deletion is not allowed in this container "
++ "yet.", containerBCSID, containerData.getOriginPipelineId(),
+minReplicatedIndex);
+return false;
+  } else {
+return true;
+  }
+} catch (IOException ioe) {
+  LOG.info(ioe.getMessage());
 
 Review comment:
   This should be LOG.error?
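
What the reviewer is suggesting might look something like the sketch below. It is a self-contained illustration under stated assumptions rather than the project's code: it uses `java.util.logging` (with `Level.SEVERE` as the nearest equivalent of an error level) in place of whatever logger the patch uses, and `ErrorLoggingSketch`, `IndexLookup` and `minReplicatedIndexOrUnknown` are hypothetical names. Passing the exception itself to the logger, instead of only `ioe.getMessage()`, keeps the stack trace in the log.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: log a failed lookup at error level, not info. */
final class ErrorLoggingSketch {

  static final Logger LOG = Logger.getLogger("ErrorLoggingSketch");

  /** Illustrative stand-in for a lookup that may fail with an IOException. */
  interface IndexLookup {
    long get() throws IOException;
  }

  private ErrorLoggingSketch() {
  }

  /** Returns the looked-up index, or -1 (assumed failure value) on error. */
  static long minReplicatedIndexOrUnknown(IndexLookup lookup) {
    try {
      return lookup.get();
    } catch (IOException ioe) {
      // SEVERE plus the throwable records both the message and the stack trace.
      LOG.log(Level.SEVERE, "Failed to read the min replicated index", ioe);
      return -1L;
    }
  }
}
```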
 



Issue Time Tracking
---

Worklog Id: (was: 298729)
Time Spent: 1h  (was: 50m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298734&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298734
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316173345
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java
 ##
 @@ -22,15 +22,11 @@
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.conf.StorageUnit;
 import org.apache.hadoop.hdds.protocol.DatanodeDetails;
-import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos
-.ContainerCommandRequestProto;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.ContainerCommandRequestProto;
 
 Review comment:
   This would cause a checkstyle issue.
 



Issue Time Tracking
---

Worklog Id: (was: 298734)
Time Spent: 1h 20m  (was: 1h 10m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298730
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316029714
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -236,14 +244,43 @@ public ContainerReportsProto getContainerReport() throws IOException {
   ContainerDeletionChoosingPolicy deletionPolicy)
   throws StorageContainerException {
 Map<Long, ContainerData> containerDataMap = containerMap.entrySet().stream()
-.filter(e -> deletionPolicy.isValidContainerType(
-e.getValue().getContainerType()))
-.collect(Collectors.toMap(Map.Entry::getKey,
-e -> e.getValue().getContainerData()));
+.filter(e ->
+deletionPolicy.isValidContainerType(e.getValue().getContainerType())
+&& isDeletionAllowed(e.getValue().getContainerData())).collect(
+Collectors.toMap(Map.Entry::getKey,
+e -> e.getValue().getContainerData()));
 return deletionPolicy
 .chooseContainerForBlockDeletion(count, containerDataMap);
   }
 
+  private boolean isDeletionAllowed(ContainerData containerData) {
+if (containerData.isClosed()) {
+  if (writeChannel instanceof XceiverServerRatis) {
+try {
+  XceiverServerRatis ratisServer = (XceiverServerRatis) writeChannel;
+  long minReplicatedIndex = ratisServer.getMinReplicatedIndex(PipelineID
+  .valueOf(UUID.fromString(containerData.getOriginPipelineId())));
+  long containerBCSID = containerData.getBlockCommitSequenceId();
+  if (minReplicatedIndex != 0 && minReplicatedIndex < containerBCSID) {
+LOG.info("Close Container lo Index {} is not replicated across all"
 
 Review comment:
   "Close Container lo Index" - What does lo stand for?
 



Issue Time Tracking
---

Worklog Id: (was: 298730)
Time Spent: 1h  (was: 50m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298733&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298733
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316175405
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java
 ##
 @@ -654,4 +646,18 @@ public void handleNodeLogFailure(RaftGroupId groupId, Throwable t) {
 triggerPipelineClose(groupId, msg,
 ClosePipelineInfo.Reason.PIPELINE_LOG_FAILED, true);
   }
+
+  public long getMinReplicatedIndex(PipelineID pipelineID) throws IOException {
+Long minIndex = null;
+Iterator raftGroupIterator = getServer().getGroups().iterator();
+while (raftGroupIterator.hasNext()) {
 
 Review comment:
   We can call the getServer().getGroupInfo(..) API directly here; the while
loop is not needed.
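
For illustration, a minimal sketch of the difference between scanning all groups and looking one up directly. It uses plain Java collections as stand-ins for the Ratis types: `GroupIndexLookup`, its map-based registry, and the method names are hypothetical and are not part of the patch or of the Ratis API. Returning an empty `OptionalLong` for an unknown group is likewise an assumption made for this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

/**
 * Hypothetical stand-in for a server that tracks several groups.
 * Keys are group (pipeline) ids; values are per-peer replicated indexes.
 */
class GroupIndexLookup {
  private final Map<String, List<Long>> indexesByGroup = new HashMap<>();

  public void addGroup(String groupId, Long... peerIndexes) {
    indexesByGroup.put(groupId, Arrays.asList(peerIndexes));
  }

  /** Loop-based variant: walk every group until the id matches. */
  public OptionalLong minIndexByScan(String groupId) {
    for (Map.Entry<String, List<Long>> e : indexesByGroup.entrySet()) {
      if (e.getKey().equals(groupId)) {
        return min(e.getValue());
      }
    }
    return OptionalLong.empty();
  }

  /** Direct variant: fetch the one group by id, no iteration over groups. */
  public OptionalLong minIndexByLookup(String groupId) {
    List<Long> indexes = indexesByGroup.get(groupId);
    return indexes == null ? OptionalLong.empty() : min(indexes);
  }

  /** Smallest index across peers; empty when the list has no entries. */
  private static OptionalLong min(List<Long> indexes) {
    return indexes.stream().mapToLong(Long::longValue).min();
  }

  public static void main(String[] args) {
    GroupIndexLookup server = new GroupIndexLookup();
    server.addGroup("pipeline-a", 9782L, 9770L, 9771L);
    System.out.println(server.minIndexByScan("pipeline-a"));
    System.out.println(server.minIndexByLookup("pipeline-a"));
    System.out.println(server.minIndexByLookup("unknown"));
  }
}
```

Both variants return the same value for a known group; the direct variant simply avoids visiting unrelated groups.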
 



Issue Time Tracking
---

Worklog Id: (was: 298733)

> Datanode unable to find chunk while replication data using ratis.
> -
>
> Key: HDDS-1753
> URL: https://issues.apache.org/jira/browse/HDDS-1753
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: HDDS-1753.000.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298731
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316217524
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -236,14 +244,43 @@ public ContainerReportsProto getContainerReport() throws IOException {
   ContainerDeletionChoosingPolicy deletionPolicy)
   throws StorageContainerException {
 Map<Long, ContainerData> containerDataMap = containerMap.entrySet().stream()
-.filter(e -> deletionPolicy.isValidContainerType(
-e.getValue().getContainerType()))
-.collect(Collectors.toMap(Map.Entry::getKey,
-e -> e.getValue().getContainerData()));
+.filter(e ->
+deletionPolicy.isValidContainerType(e.getValue().getContainerType())
+&& isDeletionAllowed(e.getValue().getContainerData())).collect(
+Collectors.toMap(Map.Entry::getKey,
+e -> e.getValue().getContainerData()));
 return deletionPolicy
 .chooseContainerForBlockDeletion(count, containerDataMap);
   }
 
+  private boolean isDeletionAllowed(ContainerData containerData) {
+if (containerData.isClosed()) {
+  if (writeChannel instanceof XceiverServerRatis) {
+try {
+  XceiverServerRatis ratisServer = (XceiverServerRatis) writeChannel;
+  long minReplicatedIndex = ratisServer.getMinReplicatedIndex(PipelineID
+  .valueOf(UUID.fromString(containerData.getOriginPipelineId())));
+  long containerBCSID = containerData.getBlockCommitSequenceId();
+  if (minReplicatedIndex != 0 && minReplicatedIndex < containerBCSID) {
 
 Review comment:
   Could there be a corner case where minReplicatedIndex is 0 while the
pipeline is still active?
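
For illustration, a minimal sketch of the condition in the diff above, reduced to its two numeric inputs. `DeletionGate` and its method signature are hypothetical; only the comparison itself is taken from the patch, and the surrounding closed-container and Ratis checks are left out. The sample indexes are borrowed from the log excerpt in the issue description.

```java
/** Hypothetical, stripped-down version of the check under review. */
class DeletionGate {
  /**
   * Mirrors the condition in the patch: deletion is blocked only when a
   * non-zero min replicated index is behind the container's BCSID.
   */
  static boolean isDeletionAllowed(long minReplicatedIndex, long containerBCSID) {
    if (minReplicatedIndex != 0 && minReplicatedIndex < containerBCSID) {
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    // A follower is behind the container's BCSID: deletion is blocked.
    System.out.println(isDeletionAllowed(9770L, 9782L)); // false
    // All peers have caught up: deletion is allowed.
    System.out.println(isDeletionAllowed(9782L, 9782L)); // true
    // The corner case raised above: an index of 0 skips the comparison.
    System.out.println(isDeletionAllowed(0L, 9782L));    // true
  }
}
```

With the condition as written, an index of 0 always permits deletion, so whether 0 can occur on an active pipeline decides whether the guard is sufficient.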
 



Issue Time Tracking
---

Worklog Id: (was: 298731)
Time Spent: 1h 10m  (was: 1h)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298727
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316220310
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java
 ##
 @@ -0,0 +1,274 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package org.apache.hadoop.ozone.client.rpc;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ReplicationFactor;
+import org.apache.hadoop.hdds.client.ReplicationType;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.scm.ScmConfigKeys;
+import org.apache.hadoop.hdds.scm.XceiverClientManager;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
+import org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.ozone.HddsDatanodeService;
+import org.apache.hadoop.ozone.MiniOzoneCluster;
+import org.apache.hadoop.ozone.OzoneConfigKeys;
+import org.apache.hadoop.ozone.client.ObjectStore;
+import org.apache.hadoop.ozone.client.OzoneClient;
+import org.apache.hadoop.ozone.client.OzoneClientFactory;
+import org.apache.hadoop.ozone.client.io.KeyOutputStream;
+import org.apache.hadoop.ozone.client.io.OzoneOutputStream;
+import org.apache.hadoop.ozone.container.ContainerTestHelper;
+import org.apache.hadoop.ozone.container.common.helpers.BlockData;
+import org.apache.hadoop.ozone.container.common.helpers.ChunkInfo;
+import org.apache.hadoop.ozone.container.common.interfaces.Container;
+import org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine;
+import org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine;
+import org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerData;
+import org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.apache.hadoop.ozone.om.helpers.OmKeyArgs;
+import org.apache.hadoop.ozone.om.helpers.OmKeyInfo;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_COMMAND_STATUS_REPORT_INTERVAL;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_REPORT_INTERVAL;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.HDDS_SCM_WATCHER_TIMEOUT;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_PIPELINE_DESTROY_TIMEOUT;
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_STALENODE_INTERVAL;
+
+/**
+ * Tests delete key operation with a slow follower in the datanode
+ * pipeline.
+ */
+public class TestDeleteWithSlowFollower {
+
+  private static MiniOzoneCluster cluster;
+  private static OzoneConfiguration conf;
+  private static OzoneClient client;
+  private static ObjectStore objectStore;
+  private static String volumeName;
+  private static String bucketName;
+  private static String path;
+  private static XceiverClientManager xceiverClientManager;
+
+  /**
+   * Create a MiniDFSCluster for testing.
+   *
+   * @throws IOException
+   */
+  @BeforeClass
+  public static void init() throws Exception {
+conf = new OzoneConfiguration();
+path = Gener

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298732
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316083784
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/ContainerTestHelper.java
 ##
 @@ -68,11 +68,13 @@
 import org.apache.hadoop.ozone.container.common.impl.ContainerData;
 import org.apache.hadoop.ozone.container.common.interfaces.Container;
 import org.apache.hadoop.ozone.container.common.transport.server.XceiverServerSpi;
+import org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine;
 
 Review comment:
   unused import
 



Issue Time Tracking
---

Worklog Id: (was: 298732)
Time Spent: 1h 10m  (was: 1h)

> Datanode unable to find chunk while replication data using ratis.
> -
>
> Key: HDDS-1753
> URL: https://issues.apache.org/jira/browse/HDDS-1753
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: HDDS-1753.000.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Leader datanode is unable to read chunk from the datanode while replicating 
> data from leader to follower.
> Please note that deletion of keys is also happening while the data is being 
> replicated.
> {code}
> 2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 1)
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 2)
> 2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK {blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
> java.lang.Exception: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
>  

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298728
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316173519
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java
 ##
 @@ -50,14 +46,7 @@
 import org.apache.ratis.grpc.GrpcFactory;
 import org.apache.ratis.grpc.GrpcTlsConfig;
 import org.apache.ratis.netty.NettyConfigKeys;
-import org.apache.ratis.protocol.RaftClientRequest;
-import org.apache.ratis.protocol.Message;
-import org.apache.ratis.protocol.RaftClientReply;
-import org.apache.ratis.protocol.ClientId;
-import org.apache.ratis.protocol.NotLeaderException;
-import org.apache.ratis.protocol.StateMachineException;
-import org.apache.ratis.protocol.RaftPeerId;
-import org.apache.ratis.protocol.RaftGroupId;
+import org.apache.ratis.protocol.*;
 
 Review comment:
   Star imports at lines 49 and 66.
 



Issue Time Tracking
---

Worklog Id: (was: 298728)
Time Spent: 1h  (was: 50m)

> Datanode unable to find chunk while replication data using ratis.
> -
>
> Key: HDDS-1753
> URL: https://issues.apache.org/jira/browse/HDDS-1753
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: HDDS-1753.000.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Leader datanode is unable to read chunk from the datanode while replicating 
> data from leader to follower.
> Please note that deletion of keys is also happening while the data is being 
> replicated.
> {code}
> 2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 1)
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 2)
> 2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | 
> op=

[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298723&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298723
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316028608
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -55,6 +56,13 @@
   ConcurrentSkipListMap<>();
   private final ConcurrentSkipListSet missingContainerSet =
   new ConcurrentSkipListSet<>();
+
+  private XceiverServerSpi writeChannel = null;
 
 Review comment:
   Can we rename writeChannel to xceiverServer or something like that?
 



Issue Time Tracking
---

Worklog Id: (was: 298723)
Time Spent: 0.5h  (was: 20m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298724&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298724
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r315820046
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -24,20 +24,21 @@
 import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
 
 Review comment:
   Unused import.
 



Issue Time Tracking
---

Worklog Id: (was: 298724)
Time Spent: 40m  (was: 0.5h)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298725
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r316029001
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -236,14 +244,43 @@ public ContainerReportsProto getContainerReport() throws IOException {
   ContainerDeletionChoosingPolicy deletionPolicy)
   throws StorageContainerException {
 Map containerDataMap = containerMap.entrySet().stream()
-.filter(e -> deletionPolicy.isValidContainerType(
-e.getValue().getContainerType()))
-.collect(Collectors.toMap(Map.Entry::getKey,
-e -> e.getValue().getContainerData()));
+.filter(e ->
+deletionPolicy.isValidContainerType(e.getValue().getContainerType())
 
 Review comment:
   We can move the deletionPolicy check inside the isDeletionAllowed function.
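
A rough sketch of what this suggestion could look like, using hypothetical stand-in types rather than the real Ozone classes. `SketchContainer`, `SketchDeletionPolicy`, `chooseContainerData` and the "container is not open" condition are assumptions made for illustration; only the names `deletionPolicy`, `isValidContainerType` and `isDeletionAllowed` come from the diff and the comment above. The idea is that the stream filter calls a single predicate, and that predicate owns the policy check.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Illustrative only: hypothetical stand-ins, not the Ozone ContainerSet API. */
final class DeletionFilterSketch {

  enum ContainerType { KEY_VALUE, OTHER }

  /** Hypothetical container holder with just the fields the sketch needs. */
  static final class SketchContainer {
    final ContainerType type;
    final boolean open;   // assumed state flag, not taken from the patch
    final String data;

    SketchContainer(ContainerType type, boolean open, String data) {
      this.type = type;
      this.open = open;
      this.data = data;
    }
  }

  /** Hypothetical policy exposing only the method visible in the diff. */
  interface SketchDeletionPolicy {
    boolean isValidContainerType(ContainerType type);
  }

  // The policy check lives inside the predicate, so callers filter on one call.
  // The "not open" condition is an assumed placeholder for any further checks.
  static boolean isDeletionAllowed(SketchContainer container,
      SketchDeletionPolicy deletionPolicy) {
    return deletionPolicy.isValidContainerType(container.type)
        && !container.open;
  }

  // Collects the data of every container that passes the single predicate.
  static Map<Long, String> chooseContainerData(
      Map<Long, SketchContainer> containerMap,
      SketchDeletionPolicy deletionPolicy) {
    return containerMap.entrySet().stream()
        .filter(e -> isDeletionAllowed(e.getValue(), deletionPolicy))
        .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().data));
  }

  public static void main(String[] args) {
    Map<Long, SketchContainer> containers = new TreeMap<>();
    containers.put(1L, new SketchContainer(ContainerType.KEY_VALUE, false, "a"));
    containers.put(2L, new SketchContainer(ContainerType.KEY_VALUE, true, "b"));
    containers.put(3L, new SketchContainer(ContainerType.OTHER, false, "c"));
    System.out.println(
        chooseContainerData(containers, t -> t == ContainerType.KEY_VALUE));
  }
}
```

Keeping both conditions in one method means the filter expression stays short and any later change to the eligibility rules touches a single place.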
 



Issue Time Tracking
---

Worklog Id: (was: 298725)
Time Spent: 50m  (was: 40m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=298726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298726
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 21/Aug/19 14:37
Start Date: 21/Aug/19 14:37
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#discussion_r315820233
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -24,20 +24,21 @@
 import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.scm.XceiverClientSpi;
 import org.apache.hadoop.hdds.scm.container.common.helpers
 .StorageContainerException;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineID;
 import org.apache.hadoop.ozone.container.common.interfaces.Container;
 import org.apache.hadoop.ozone.container.common
 .interfaces.ContainerDeletionChoosingPolicy;
+import org.apache.hadoop.ozone.container.common.transport.server.XceiverServerSpi;
+import org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis;
 import org.apache.hadoop.ozone.container.common.volume.HddsVolume;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 import java.io.IOException;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Set;
-import java.util.Map;
+import java.util.*;
 
 Review comment:
   We can remove the star import.
 



Issue Time Tracking
---

Worklog Id: (was: 298726)
Time Spent: 50m  (was: 40m)


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=297801&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297801
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 20/Aug/19 11:43
Start Date: 20/Aug/19 11:43
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1318: HDDS-1753. 
Datanode unable to find chunk while replication data using ratis.
URL: https://github.com/apache/hadoop/pull/1318#issuecomment-522975966
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|:--------|:--------|
   | 0 | reexec | 47 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 69 | Maven dependency ordering for branch |
   | +1 | mvninstall | 616 | trunk passed |
   | +1 | compile | 375 | trunk passed |
   | +1 | checkstyle | 62 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 793 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 156 | trunk passed |
   | 0 | spotbugs | 428 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 619 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 29 | Maven dependency ordering for patch |
   | +1 | mvninstall | 545 | the patch passed |
   | +1 | compile | 373 | the patch passed |
   | +1 | javac | 373 | the patch passed |
   | -0 | checkstyle | 35 | hadoop-hdds: The patch generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
   | -0 | checkstyle | 34 | hadoop-ozone: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 630 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 154 | the patch passed |
   | +1 | findbugs | 620 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 186 | hadoop-hdds in the patch failed. |
   | -1 | unit | 2317 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 39 | The patch does not generate ASF License warnings. |
   | | | 7864 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.container.ozoneimpl.TestOzoneContainer |
   |   | hadoop.ozone.container.common.impl.TestContainerDeletionChoosingPolicy |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientForAclAuditLog |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.ozone.container.common.TestBlockDeletingService |
   |   | hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion |
   |   | hadoop.ozone.container.server.TestSecureContainerServer |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1318 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux aa103a651691 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 6244502 |
   | Default Java | 1.8.0_222 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/1/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/1/artifact/out/diff-checkstyle-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/1/artifact/out/patch-unit-hadoop-hdds.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/1/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/1/testReport/ |
   | Max. process+thread count | 4841 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-test U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1318/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 


[jira] [Work logged] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.

2019-08-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1753?focusedWorklogId=297717&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297717
 ]

ASF GitHub Bot logged work on HDDS-1753:


Author: ASF GitHub Bot
Created on: 20/Aug/19 09:31
Start Date: 20/Aug/19 09:31
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1318: HDDS-1753. 
HDDS-1610. applyTransaction failure should not be lost on restart.
URL: https://github.com/apache/hadoop/pull/1318
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 297717)
Remaining Estimate: 0h
Time Spent: 10m

> Datanode unable to find chunk while replication data using ratis.
> -
>
> Key: HDDS-1753
> URL: https://issues.apache.org/jira/browse/HDDS-1753
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Mukul Kumar Singh
> Assignee: Shashikant Banerjee
> Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: HDDS-1753.000.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Leader datanode is unable to read chunk from the datanode while replicating 
> data from leader to follower.
> Please note that deletion of keys is also happening while the data is being 
> replicated.
> {code}
> 2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) - Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 1)
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146)) - Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already has the append entries (first index: 2)
> 2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972)) - 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK {blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
> java.lang.Exception: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148) ~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:346)
>  ~[hadoop-hdds-con