[
https://issues.apache.org/jira/browse/HDDS-9142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752736#comment-17752736
]
Siddhant Sangwan commented on HDDS-9142:
----------------------------------------
Some early observations.
I started by looking into this exception:
{code:java}
2023-08-05 13:40:52,947 WARN [IPC Server handler 35 on
9863]-org.apache.hadoop.hdds.scm.pipeline.WritableECContainerProvider: Unable
to allocate a container after trying 0 existing ones; requested size=314572800,
replication=EC{rs-3-2-1024k}, owner=om1546336201, ExcludeList {datanodes = [],
containerIds = [], pipelineIds =
[PipelineID=dd2decc1-9f9e-42b3-9651-0f081f1b7d17,
PipelineID=fd6aff3c-ddb6-4841-9417-3c938478c398,
PipelineID=dee07b7b-5e36-4921-8c06-160133ffe5df,
PipelineID=3731a0fe-1fc5-4a4a-823e-e09e2acb07a6,
PipelineID=d104af80-567c-445d-aa72-0fccefdde368]}
java.io.IOException: Pipeline limit (5) reached (5), none closed
{code}
I tracked each of the 5 pipelines listed in the {{ExcludeList}}. The container
associated with each of them had been closed earlier, so none of these existing
pipelines is available for writes. However, none of the 5 pipelines had itself
been closed when this {{allocateContainer}} request came in. So these pipelines
are present in the exclude list and still count towards the number of open
pipelines, when in reality they should already have been closed.
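To make the suspected failure mode concrete, here is a minimal toy model of it.
This is only a sketch under my own assumptions, not the actual
{{WritableECContainerProvider}} code: the class {{PipelineLimitSketch}}, its
nested {{Pipeline}} type, and the methods {{allocate}} and {{tryAllocate}} are
all hypothetical names. The only thing taken from the log above is the error
message format. It shows that pipelines whose containers are closed, but which
are still open themselves, use up the whole limit.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical toy model; NOT the real SCM allocation logic. */
public class PipelineLimitSketch {

  /** Toy pipeline: its own state plus the state of its single container. */
  static final class Pipeline {
    final String id;
    boolean open = true;          // pipeline state as the allocator sees it
    boolean containerOpen = true; // whether its container still accepts writes

    Pipeline(String id) {
      this.id = id;
    }
  }

  /**
   * Returns the id of a usable pipeline. A new pipeline is created only
   * while the number of OPEN pipelines is below the limit (assumed rule).
   */
  static String allocate(List<Pipeline> pipelines, Set<String> excluded,
      int limit) throws IOException {
    int openCount = 0;
    for (Pipeline p : pipelines) {
      if (!p.open) {
        continue; // closed pipelines do not count towards the limit
      }
      openCount++;
      if (p.containerOpen && !excluded.contains(p.id)) {
        return p.id; // an existing pipeline can still take the write
      }
    }
    if (openCount >= limit) {
      throw new IOException("Pipeline limit (" + limit + ") reached ("
          + openCount + "), none closed");
    }
    Pipeline created = new Pipeline("new-" + pipelines.size());
    pipelines.add(created);
    return created.id;
  }

  /** Convenience wrapper that turns the exception into a string. */
  static String tryAllocate(List<Pipeline> pipelines, Set<String> excluded,
      int limit) {
    try {
      return allocate(pipelines, excluded, limit);
    } catch (IOException e) {
      return "ERROR: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    List<Pipeline> pipelines = new ArrayList<>();
    Set<String> excluded = new HashSet<>();
    for (int i = 0; i < 5; i++) {
      Pipeline p = new Pipeline("p" + i);
      p.containerOpen = false; // container closed, pipeline left open
      pipelines.add(p);
      excluded.add(p.id);      // client also excludes it after the failure
    }
    // All 5 pipelines are stale but open, so the limit blocks a new one.
    System.out.println(tryAllocate(pipelines, excluded, 5));
    // Closing one stale pipeline frees a slot for a new pipeline.
    pipelines.get(0).open = false;
    System.out.println(tryAllocate(pipelines, excluded, 5));
  }
}
```

In this model, closing a pipeline as soon as its container is closed is enough
to let the next request create a new pipeline. Whether that is the right fix in
SCM itself still needs to be confirmed.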
> EC write fails when there are only 5 DNs up with rs-3-2-1024k policy
> --------------------------------------------------------------------
>
> Key: HDDS-9142
> URL: https://issues.apache.org/jira/browse/HDDS-9142
> Project: Apache Ozone
> Issue Type: Bug
> Components: EC
> Reporter: Varsha Ravi
> Assignee: Siddhant Sangwan
> Priority: Major
>
> When rs-3-2-1024k policy is set, and put key operation is done, the write
> fails with the below error when there are only 5 DNs up.
> {noformat}
> 23/08/05 15:36:35 WARN io.KeyOutputStream: Failure for replica index: 5,
> DatanodeDetails: dd57542f-2821-4be7-bc1e-be7c8027e109
> java.io.IOException: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException:
> Requested operation not allowed as ContainerState is CLOSED
> Caused by:
> org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException:
> Requested operation not allowed as ContainerState is CLOSED
> ... 6 more
> INTERNAL_ERROR Pipeline limit (5) reached (5), none closed{noformat}
> The cluster has default value for *ozone.scm.ec.pipeline.minimum* which is 5.