[ https://issues.apache.org/jira/browse/HDDS-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siyao Meng updated HDDS-9146:
-----------------------------
    Description: 
It is observed that when {{hsync()}} is followed by a {{close()}} on a key 
stream (which triggers two {{OMKeyCommitRequest}}s, the first with {{isHSync 
= true}} and the second with {{isHSync = false}}), {{deletedTable}} can end up 
with an entry that has the exact same block {{conID}} (container ID) and 
{{locID}} (local ID) as the committed key in {{keyTable}}. This can cause OM's 
{{KeyDeletingService}} to call SCM to remove the committed block by mistake.

The catch is that actual data loss won't happen until the container is closed; 
only then does block deletion actually happen on the DNs. CMIIW [~erose]

Repro integration test branch (based on [~erose]'s integration test, which is 
in turn based on my initial draft):

https://github.com/smengcl/hadoop-ozone/tree/HDDS-9146-repro

Run the integration test {{TestMiniOzoneCluster#testKeyRenameDirDelete}} to 
reproduce:

{code:title=Test log. Note that the entries in keyTable and deletedTable share 
the same block conID: 1 and locID: 111677748019200001}
2023-08-09 14:31:54,859 [main] WARN  ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(159)) - keyTable:     ----- START -----
2023-08-09 14:31:54,860 [main] WARN  ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(168)) - keyTable:     key = /testozonevol/testozonebucket/inputTera/_temporary/1/_temporary/attempt_1691047336995_0006_m_000001_0/part-m-00001, val = OmKeyInfo{volumeName='testozonevol', bucketName='testozonebucket', keyName='inputTera/_temporary/1/_temporary/attempt_1691047336995_0006_m_000001_0/part-m-00001', dataSize=11, keyLocationVersions=[OmKeyLocationInfoGroup{version=0, locationVersionMap={0=[{blockID={conID: 1 locID: 111677748019200001 bcsId: 2}, length=11, offset=0, token=null, pipeline=null, createVersion=0, partNumber=0}]}, isMultipartKey=false}], creationTime=1691616714661, modificationTime=1691616714848, replicationConfig=RATIS/THREE, encInfo=null, fileChecksum=null, isFile=true, fileName='part-m-00001'}
2023-08-09 14:31:54,860 [main] WARN  ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(171)) - keyTable:     -----  END  -----
2023-08-09 14:31:54,860 [main] WARN  ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(173)) - deletedTable: ----- START -----
2023-08-09 14:31:54,861 [main] WARN  ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(181)) - deletedTable: key = /testozonevol/testozonebucket/inputTera/_temporary/1/_temporary/attempt_1691047336995_0006_m_000001_0/part-m-00001/-9223372036854774528, val = RepeatedOmKeyInfo{omKeyInfoList=[OmKeyInfo{volumeName='testozonevol', bucketName='testozonebucket', keyName='inputTera/_temporary/1/_temporary/attempt_1691047336995_0006_m_000001_0/part-m-00001', dataSize=11, keyLocationVersions=[OmKeyLocationInfoGroup{version=0, locationVersionMap={0=[{blockID={conID: 1 locID: 111677748019200001 bcsId: 0}, length=11, offset=0, token=null, pipeline=null, createVersion=0, partNumber=0}]}, isMultipartKey=false}], creationTime=1691616714661, modificationTime=1691616714834, replicationConfig=RATIS/THREE, encInfo=null, fileChecksum=null, isFile=true, fileName='part-m-00001'}]}
2023-08-09 14:31:54,861 [main] WARN  ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(184)) - deletedTable: -----  END  -----
{code}

It sounds to me like the fix should be to filter out any block that shares the 
same container ID and local ID as a block of the committed key when adding an 
entry to {{deletedTable}} inside {{OMKeyCommitRequest}} / 
{{OMKeyCommitRequestWithFSO}}. But I'm no expert in HSync, so please advise. 
cc [~weichiu] [~szetszwo]
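
A minimal sketch of the filtering idea, for discussion only. It assumes the 
blocks about to be queued for deletion and the blocks of the key being 
committed are both available as simple lists. {{HsyncDeleteFilterSketch}}, 
{{BlockRef}} and {{filterOutCommitted}} are hypothetical stand-ins, not Ozone's 
actual classes or methods; the sample values are taken from the test log.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types for illustration; NOT Ozone's real classes.
final class HsyncDeleteFilterSketch {

  /** Minimal block reference carrying the three IDs shown in the log. */
  static final class BlockRef {
    final long conId;
    final long locId;
    final long bcsId;

    BlockRef(long conId, long locId, long bcsId) {
      this.conId = conId;
      this.locId = locId;
      this.bcsId = bcsId;
    }

    /** Identity used for the check: (conId, locId) only, bcsId is ignored. */
    String identity() {
      return conId + ":" + locId;
    }
  }

  /**
   * Returns the candidates that are safe to queue for deletion, i.e. those
   * whose (conId, locId) pair is not referenced by the key being committed.
   */
  static List<BlockRef> filterOutCommitted(List<BlockRef> candidates,
                                           List<BlockRef> committed) {
    Set<String> live = new HashSet<>();
    for (BlockRef b : committed) {
      live.add(b.identity());
    }
    List<BlockRef> safe = new ArrayList<>();
    for (BlockRef b : candidates) {
      if (!live.contains(b.identity())) {
        safe.add(b);
      }
    }
    return safe;
  }

  public static void main(String[] args) {
    // Committed key's block, as seen in keyTable in the log (bcsId: 2).
    List<BlockRef> committed =
        List.of(new BlockRef(1L, 111677748019200001L, 2L));
    List<BlockRef> candidates = List.of(
        // Same block as seen in deletedTable in the log (bcsId: 0).
        new BlockRef(1L, 111677748019200001L, 0L),
        // Made-up second block that is genuinely no longer referenced.
        new BlockRef(1L, 111677748019200002L, 0L));

    List<BlockRef> safe = filterOutCommitted(candidates, committed);
    System.out.println(safe.size() + " block(s) left to delete, locId="
        + safe.get(0).locId);
  }
}
```

Comparing on the (conID, locID) pair rather than the whole block ID matters 
here because the two entries in the log differ only in {{bcsId}} (2 vs 0), so a 
full equality check would treat them as different blocks.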


> Potential data loss with HSync due to deletedTable addition in 
> OMKeyCommitRequest
> ---------------------------------------------------------------------------------
>
>                 Key: HDDS-9146
>                 URL: https://issues.apache.org/jira/browse/HDDS-9146
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Siyao Meng
>            Priority: Critical


