[
https://issues.apache.org/jira/browse/HDDS-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Siyao Meng updated HDDS-9146:
-----------------------------
Description:
It is observed that when {{hsync()}} is called, followed by a {{close()}}, on a
key stream (which triggers two {{OMKeyCommitRequest}}s, the first with
{{isHSync = true}} and the second with {{isHSync = false}}), {{deletedTable}}
can end up with an entry that has the exact same block {{conID}} (container ID)
and {{locID}} (local ID) as the committed key in {{keyTable}}. This can cause
OM's {{KeyDeletingService}} to ask SCM to delete the committed block by mistake.
The catch is that actual data loss won't happen until the container is closed;
only then does block deletion actually happen on the DNs. CMIIW [~erose]
Repro integration test branch (based on [~erose]'s integration test, which in
turn builds on my initial draft):
https://github.com/smengcl/hadoop-ozone/tree/HDDS-9146-repro
Run the integration test {{TestMiniOzoneCluster#testKeyRenameDirDelete}} to
reproduce:
{code:title=Test log. See entries in keyTable and deletedTable share the same block conID: 1 and locID: 111677748019200001}
2023-08-09 14:31:54,859 [main] WARN ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(159)) - keyTable: ----- START -----
2023-08-09 14:31:54,860 [main] WARN ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(168)) - keyTable: key = /testozonevol/testozonebucket/inputTera/_temporary/1/_temporary/attempt_1691047336995_0006_m_000001_0/part-m-00001, val = OmKeyInfo{volumeName='testozonevol', bucketName='testozonebucket', keyName='inputTera/_temporary/1/_temporary/attempt_1691047336995_0006_m_000001_0/part-m-00001', dataSize=11, keyLocationVersions=[OmKeyLocationInfoGroup{version=0, locationVersionMap={0=[{blockID={conID: 1 locID: 111677748019200001 bcsId: 2}, length=11, offset=0, token=null, pipeline=null, createVersion=0, partNumber=0}]}, isMultipartKey=false}], creationTime=1691616714661, modificationTime=1691616714848, replicationConfig=RATIS/THREE, encInfo=null, fileChecksum=null, isFile=true, fileName='part-m-00001'}
2023-08-09 14:31:54,860 [main] WARN ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(171)) - keyTable: ----- END -----
2023-08-09 14:31:54,860 [main] WARN ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(173)) - deletedTable: ----- START -----
2023-08-09 14:31:54,861 [main] WARN ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(181)) - deletedTable: key = /testozonevol/testozonebucket/inputTera/_temporary/1/_temporary/attempt_1691047336995_0006_m_000001_0/part-m-00001/-9223372036854774528, val = RepeatedOmKeyInfo{omKeyInfoList=[OmKeyInfo{volumeName='testozonevol', bucketName='testozonebucket', keyName='inputTera/_temporary/1/_temporary/attempt_1691047336995_0006_m_000001_0/part-m-00001', dataSize=11, keyLocationVersions=[OmKeyLocationInfoGroup{version=0, locationVersionMap={0=[{blockID={conID: 1 locID: 111677748019200001 bcsId: 0}, length=11, offset=0, token=null, pipeline=null, createVersion=0, partNumber=0}]}, isMultipartKey=false}], creationTime=1691616714661, modificationTime=1691616714834, replicationConfig=RATIS/THREE, encInfo=null, fileChecksum=null, isFile=true, fileName='part-m-00001'}]}
2023-08-09 14:31:54,861 [main] WARN ozone.TestMiniOzoneCluster (TestMiniOzoneCluster.java:testKeyRenameDirDelete(184)) - deletedTable: ----- END -----
{code}
It sounds to me like the fix should be to filter out any block that shares the
same container ID and local ID as the keyTable/fileTable entry when adding to
{{deletedTable}} inside {{OMKeyCommitRequest}} / {{OMKeyCommitRequestWithFSO}}.
But I'm no expert in HSync, so please advise. cc [~weichiu] [~szetszwo]
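To make the proposed filtering concrete, here is a minimal, standalone sketch of the idea. It is not Ozone code: {{DeletedBlockFilterSketch}}, {{BlockId}} and {{filterOutCommitted}} are hypothetical names made up for illustration, and the real change would have to work against the actual key and block location types inside the commit request. The one detail taken from the log above is that the two entries differ only in {{bcsId}} (2 vs 0), so the comparison here deliberately ignores it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DeletedBlockFilterSketch {

  /** Hypothetical stand-in for a block ID; NOT Ozone's own block ID class. */
  static final class BlockId {
    final long containerId;
    final long localId;
    final long bcsId;

    BlockId(long containerId, long localId, long bcsId) {
      this.containerId = containerId;
      this.localId = localId;
      this.bcsId = bcsId;
    }

    /** Identity used for the check: container ID and local ID only, bcsId ignored. */
    String identity() {
      return containerId + ":" + localId;
    }
  }

  /**
   * Returns the candidate blocks that are safe to queue for deletion, i.e. those
   * whose (containerId, localId) pair is not referenced by the committed key.
   */
  static List<BlockId> filterOutCommitted(List<BlockId> candidates,
                                          List<BlockId> committed) {
    Set<String> live = new HashSet<>();
    for (BlockId b : committed) {
      live.add(b.identity());
    }
    List<BlockId> safe = new ArrayList<>();
    for (BlockId b : candidates) {
      if (!live.contains(b.identity())) {
        safe.add(b);
      }
    }
    return safe;
  }

  public static void main(String[] args) {
    // Values mirror the test log: same conID/locID, different bcsId.
    List<BlockId> committed =
        Arrays.asList(new BlockId(1L, 111677748019200001L, 2L));
    List<BlockId> candidates =
        Arrays.asList(new BlockId(1L, 111677748019200001L, 0L));
    List<BlockId> safe = filterOutCommitted(candidates, committed);
    System.out.println("blocks left to delete: " + safe.size());
  }
}
```

With the values from the log, the only deletion candidate is the block still referenced by the committed key, so nothing would be handed to the deletion path. Whether an entry whose block list becomes empty should be written to {{deletedTable}} at all is a separate question this sketch does not answer.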
> Potential data loss with HSync due to deletedTable addition in
> OMKeyCommitRequest
> ---------------------------------------------------------------------------------
>
> Key: HDDS-9146
> URL: https://issues.apache.org/jira/browse/HDDS-9146
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Siyao Meng
> Priority: Critical