[
https://issues.apache.org/jira/browse/HDDS-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-15305:
----------------------------------
Labels: pull-request-available (was: )
> ozone admin container list --all returns duplicate containers due to
> sort/pagination key mismatch in SCM
> --------------------------------------------------------------------------------------------------------
>
> Key: HDDS-15305
> URL: https://issues.apache.org/jira/browse/HDDS-15305
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Sreeja
> Assignee: Sreeja
> Priority: Major
> Labels: pull-request-available
>
> The *ozone admin container list --all* command produces duplicate container
> entries in its output. The duplicates are non-deterministic across runs:
> which containers are duplicated depends on the order in which the SCM returns
> them, which can vary.
> Root Cause:
> The bug is a mismatch between the server-side sort key and the client-side
> pagination key.
> Server side (SCMClientProtocolServer.listContainerInternal): The method
> filters containers by containerID >= startContainerID, then calls .sorted(),
> which invokes ContainerInfo.compareTo(). That comparator is defined as:
>
> {code:java}
> private static final Comparator<ContainerInfo> COMPARATOR =
>     Comparator.comparingLong(info -> info.getLastUsed().toEpochMilli());
> {code}
> So the batch is returned sorted by lastUsed timestamp, not by containerID.
> Client side (ListSubcommand.listAllContainers): The pagination loop advances
> the cursor to the last returned element's container ID plus one.
> This assumes the last element of the batch has the highest container ID. That
> is only true if the batch is sorted by container ID. Since it is actually
> sorted by lastUsed, the last element can have a lower container ID than other
> elements already in the same batch.
>
> Example: With batch size 40, a batch may return containers with IDs [..., 41,
> 42, 44, 40] (sorted by lastUsed). The cursor is set to 40 + 1 = 41. The next
> batch fetches containers with containerID >= 41, returning [41, 42, 43, 44,
> 45, ...], which re-fetches 41, 42, and 44.
>
> Fix:
> In SCMClientProtocolServer.listContainerInternal, replace the lastUsed-based
> sort with a containerID-based sort, aligning the sort key with the pagination
> key:
> {code:java}
> .sorted(Comparator.comparing(ContainerInfo::containerID))
> {code}
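The mismatch described in the issue can be reproduced outside of Ozone with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in rather than the real SCM or CLI code: the `Container` class, the `listBatch` and `listAll` helpers, and the six-element sample data are invented for illustration. It also assumes the server applies filter, sort, and limit in that order, and that the client stops on an empty batch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-ins only; none of these are the real Ozone classes.
final class PaginationSketch {

  // Simplified stand-in for ContainerInfo: an ID and a lastUsed time in millis.
  static final class Container {
    final long id;
    final long lastUsedMillis;

    Container(long id, long lastUsedMillis) {
      this.id = id;
      this.lastUsedMillis = lastUsedMillis;
    }
  }

  // Sort key described as the root cause: lastUsed timestamp.
  static final Comparator<Container> BY_LAST_USED =
      Comparator.comparingLong(c -> c.lastUsedMillis);

  // Sort key proposed as the fix: container ID, matching the pagination key.
  static final Comparator<Container> BY_ID =
      Comparator.comparingLong(c -> c.id);

  // Assumed server behavior: filter by id >= startId, sort, then limit.
  static List<Container> listBatch(List<Container> all, long startId,
      int count, Comparator<Container> order) {
    return all.stream()
        .filter(c -> c.id >= startId)
        .sorted(order)
        .limit(count)
        .collect(Collectors.toList());
  }

  // Assumed client loop: the cursor is the last returned element's ID plus one.
  static List<Long> listAll(List<Container> all, int count,
      Comparator<Container> order) {
    List<Long> ids = new ArrayList<>();
    long start = 1;
    while (true) {
      List<Container> batch = listBatch(all, start, count, order);
      if (batch.isEmpty()) {
        break;
      }
      for (Container c : batch) {
        ids.add(c.id);
      }
      start = batch.get(batch.size() - 1).id + 1;
    }
    return ids;
  }

  // Returns every ID that shows up more than once, in order of repetition.
  static List<Long> duplicates(List<Long> ids) {
    Set<Long> seen = new HashSet<>();
    List<Long> dups = new ArrayList<>();
    for (Long id : ids) {
      if (!seen.add(id)) {
        dups.add(id);
      }
    }
    return dups;
  }

  // Invented data: container 4 was used earlier than container 3.
  static List<Container> sampleContainers() {
    List<Container> all = new ArrayList<>();
    all.add(new Container(1, 10));
    all.add(new Container(2, 20));
    all.add(new Container(3, 40));
    all.add(new Container(4, 30));
    all.add(new Container(5, 50));
    all.add(new Container(6, 60));
    return all;
  }

  public static void main(String[] args) {
    List<Container> all = sampleContainers();
    // prints "by lastUsed: [1, 2, 4, 3, 4, 5, 6]"
    System.out.println("by lastUsed: " + listAll(all, 4, BY_LAST_USED));
    // prints "by ID:       [1, 2, 3, 4, 5, 6]"
    System.out.println("by ID:       " + listAll(all, 4, BY_ID));
  }
}
```

With the lastUsed ordering, the first batch ends on container 3, so the cursor moves to 4 and container 4 is fetched a second time. With the ID ordering, the last element of each batch always carries the highest ID in that batch, so the cursor never steps back over an element that was already returned.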