zhouao created HADOOP-19373:
-------------------------------
Summary: AliyunOSS: When using the OSS Connector, special
circumstances can lead to deadlocks in IO threads, causing computation tasks to
become stuck
Key: HADOOP-19373
URL: https://issues.apache.org/jira/browse/HADOOP-19373
Project: Hadoop Common
Issue Type: Bug
Components: fs/oss
Affects Versions: 3.4.1, 3.3.6
Reporter: zhouao
!https://aone.alibaba-inc.com/v2/api/workitem/adapter/file/url?fileIdentifier=workitem%2Falibaba%2F1026733%2F1733219963897image.png!
When using {{AliyunOSSBlockOutputStream}} to read data, the threads waiting to
read data and the threads accessing OSS to fetch data are coordinated using a
condition lock. We have observed that under certain special circumstances, such
as incorrect environment configurations or misconfigurations of third-party
dependencies, the threads responsible for fetching data can abnormally
terminate. After termination, these threads do not release the locks they hold,
leading to indefinite waiting by other threads. From the perspective of the
computation engine, the tasks appear stalled (but do not fail).
We believe that in such situations, it is best for the computation engine to
detect the error and exit accordingly, rather than remain indefinitely stuck.
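To illustrate the behaviour proposed above, here is a minimal sketch of a fetch thread and a waiting reader coordinated through a lock and condition. It is not the connector's actual code: the class {{ReadBufferSketch}}, its methods {{fetch}} and {{await}}, and the timeout value are all hypothetical names chosen for this example. The idea is that the fetch side records an outcome and signals the waiters in a {{finally}} block, even when it dies with an {{Error}}, and that the waiting side uses a bounded wait so it can fail instead of hanging.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical buffer shared by a waiting reader and an IO (fetch) thread.
// Names are illustrative only and are not taken from the OSS connector.
public class ReadBufferSketch {
    enum Status { INIT, SUCCESS, ERROR }

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition ready = lock.newCondition();
    private Status status = Status.INIT; // guarded by lock
    private byte[] data;                 // guarded by lock
    private Throwable cause;             // guarded by lock

    // Fetch side. Throwable is caught so that Errors (for example a
    // NoClassDefFoundError from a broken dependency) are reported too.
    void fetch(Callable<byte[]> source) {
        byte[] result = null;
        Throwable failure = null;
        try {
            result = source.call();
        } catch (Throwable t) {
            failure = t;
        } finally {
            // Always publish an outcome and wake the waiters.
            lock.lock();
            try {
                data = result;
                cause = failure;
                status = (failure == null) ? Status.SUCCESS : Status.ERROR;
                ready.signalAll();
            } finally {
                lock.unlock();
            }
        }
    }

    // Reader side. Waits at most timeoutMillis, then fails with IOException.
    byte[] await(long timeoutMillis) throws IOException, InterruptedException {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (status == Status.INIT) {
                if (nanos <= 0L) {
                    throw new IOException("timed out waiting for fetch");
                }
                nanos = ready.awaitNanos(nanos);
            }
            if (status == Status.ERROR) {
                throw new IOException("fetch failed", cause);
            }
            return data;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        ReadBufferSketch buf = new ReadBufferSketch();
        // Simulate an IO thread that terminates abnormally.
        Thread io = new Thread(() -> buf.fetch(() -> {
            throw new NoClassDefFoundError("simulated");
        }));
        io.start();
        try {
            buf.await(5000);
        } catch (IOException e) {
            System.out.println("reader failed instead of hanging: " + e.getMessage());
        }
        io.join();
    }
}
```

With this shape, the reader's {{IOException}} propagates to the caller, so the computation engine can observe a task failure rather than a stalled task. Whether a timeout, an error status, or both is appropriate for the connector is a design decision left open here.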
--
This message was sent by Atlassian Jira
(v8.20.10#820010)