[ https://issues.apache.org/jira/browse/HDDS-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Elek updated HDDS-3214:
------------------------------
    Priority: Blocker  (was: Major)

> Unhealthy datanodes repeatedly participate in pipeline creation
> ---------------------------------------------------------------
>
>                 Key: HDDS-3214
>                 URL: https://issues.apache.org/jira/browse/HDDS-3214
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Nilotpal Nandi
>            Assignee: Prashant Pogde
>            Priority: Blocker
>              Labels: TriagePending, fault_injection
>
> Steps taken:
> 1) Mounted a noise-injection FUSE on all datanodes.
> 2) Selected 1 datanode from each open pipeline (factor=3).
> 3) Injected WRITE FAILURE noise with error code ENOENT on the
> "hdds.datanode.dir" path of the datanodes selected in step 2).
> 4) Started a PUT key operation of size 32 MB.
>
> Observation:
> ----------------
> # The commit failed and the pipelines were moved to the exclusion list.
> # The client retries, and a new pipeline is created with the same set of
> datanodes. Container creation fails because the WRITE FAILURE injection is
> still present.
> # The pipeline is closed and the process is repeated for
> "ozone.client.max.retries" retries.
> Every time, the same set of datanodes is selected for pipeline creation,
> and that set includes 1 unhealthy datanode.
> Expectation: the pipeline should have been created by selecting 3 healthy
> datanodes from those available.
>
> cc - [~ljain]
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
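
The expectation stated in the issue (choose 3 healthy datanodes instead of reusing one that just failed) can be illustrated with a small sketch. This is not Ozone's SCM placement code: the class `PlacementSketch`, its methods, and the datanode names are all hypothetical, and the policy simply takes the first eligible nodes in list order.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PlacementSketch:
    """Hypothetical stand-in for a pipeline placement policy; not Ozone code."""

    datanodes: list[str]
    factor: int = 3
    excluded: set[str] = field(default_factory=set)

    def report_failure(self, datanode: str) -> None:
        # Remember a datanode on which container creation failed.
        self.excluded.add(datanode)

    def choose_pipeline(self) -> list[str]:
        # Consider only datanodes that have not been reported as failing.
        healthy = [dn for dn in self.datanodes if dn not in self.excluded]
        if len(healthy) < self.factor:
            raise RuntimeError("not enough healthy datanodes for a pipeline")
        return healthy[: self.factor]


policy = PlacementSketch(datanodes=["dn1", "dn2", "dn3", "dn4", "dn5"])
first = policy.choose_pipeline()   # initial pipeline includes dn2
policy.report_failure("dn2")       # assume the write failure was injected on dn2
retry = policy.choose_pipeline()   # the retry must no longer include dn2
```

In the behaviour reported above, the retry would return the same three datanodes as the first attempt. In the sketch, the failed datanode is skipped and the next available one takes its place.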