oven-yang opened a new issue, #6288:
URL: https://github.com/apache/opendal/issues/6288

   ### Describe the bug
   
   While using `opendal` with the `services-hdfs` feature to access HDFS, I 
observed that a write operation can hang forever when the underlying HDFS 
write fails, for example with a "quota exceeded" error.
   
   After adding debug logs, I traced the hang to the `HdfsWriter::write` method.
   
   ```rust
   impl oio::Write for HdfsWriter<hdrs::AsyncFile> {
       async fn write(&mut self, mut bs: Buffer) -> Result<()> {
           let len = bs.len() as u64;
           let f = self.f.as_mut().expect("HdfsWriter must be initialized");
   
           while bs.has_remaining() {
               let n = f.write(bs.chunk()).await.map_err(new_std_io_error)?;
               bs.advance(n);
           }
   
           self.size += len;
           Ok(())
       }
   }
   ```
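   One possible mitigation (a sketch only, not a confirmed fix for opendal) is to treat a zero-byte write as an error rather than retrying forever, mirroring what `std::io::Write::write_all` does with `ErrorKind::WriteZero`. The helper and stub writer below are hypothetical names used for illustration:

    ```rust
    use std::io::{Error, ErrorKind, Result, Write};

    /// Write all of `buf`, but bail out if the underlying writer reports
    /// zero bytes written -- the condition that makes the loop in
    /// `HdfsWriter::write` spin forever. Mirrors `Write::write_all`.
    fn write_all_checked<W: Write>(w: &mut W, mut buf: &[u8]) -> Result<()> {
        while !buf.is_empty() {
            let n = w.write(buf)?;
            if n == 0 {
                // Surface the failure instead of looping indefinitely.
                return Err(Error::new(
                    ErrorKind::WriteZero,
                    "failed to write whole buffer",
                ));
            }
            buf = &buf[n..];
        }
        Ok(())
    }

    /// A stub writer that always reports 0 bytes written, emulating the
    /// behavior observed when the HDFS write fails.
    struct ZeroWriter;

    impl Write for ZeroWriter {
        fn write(&mut self, _buf: &[u8]) -> Result<usize> {
            Ok(0)
        }
        fn flush(&mut self) -> Result<()> {
            Ok(())
        }
    }

    fn main() {
        let err = write_all_checked(&mut ZeroWriter, b"payload").unwrap_err();
        println!("{:?}", err.kind()); // WriteZero
    }
    ```

   With this guard, the demo below would fail with `WriteZero` instead of hanging, though the proper fix may belong in `hdrs` or `blocking` rather than here.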
   
   Specifically, `f.write` consistently returns `Ok(0)` in this failure 
scenario, which turns the `while bs.has_remaining()` loop into an infinite 
loop. The root cause appears to be 
[this issue](https://github.com/smol-rs/blocking/issues/70) in the `blocking` crate.
   
   I would greatly appreciate it if someone could review and confirm this issue.
   
   ### Steps to Reproduce
   
   1. Modify opendal's `Cargo.toml` to depend on a local checkout of the 
`hdrs` crate.
   2. Patch the `write` methods of `impl Write for File` and 
`impl Write for &File` in the local `hdrs` crate so they always return an `Err`.
   
   ```rust
   impl Write for File {
       fn write(&mut self, buf: &[u8]) -> Result<usize> {
           println!("fail write, file:{}, len:{}", self.path, buf.len());
           return Err(ErrorKind::Interrupted.into());
       }
   }
   
   impl Write for &File {
       fn write(&mut self, buf: &[u8]) -> Result<usize> {
           println!("fail write&, file:{}, len:{}", self.path, buf.len());
           return Err(ErrorKind::Interrupted.into());
       }
   }
   ```
   
   3. Write a demo program that uses the modified `opendal` to write to an HDFS 
file.
   
   ```rust
   #[tokio::main]
   async fn main() {
       let url = Url::parse("hdfs://xxxxxxx/user/someone/test_write").unwrap();
       let path = url.path();
       let name_node = format!("{}://{}", url.scheme(), 
url.host_str().unwrap());
   
       let config = services::Hdfs::default()
           .name_node(name_node.as_str())
           .root("/")
           .kerberos_ticket_cache_path("/path/to/cache")
           .enable_append(true);
       let op = Operator::new(config).unwrap().finish();
   
       let mut writer = op.writer(path).await.unwrap();
   
       let data = bytes::Bytes::from(vec![0; 100 * 1024 * 1024]);
    writer.write(data).await.unwrap(); // hangs here
       writer.close().await.unwrap();
       drop(writer);
   }
   ```
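   A side note on step 2 of the reproduction: the injected failure uses `ErrorKind::Interrupted`, which synchronous call sites such as `std::io::Write::write_all` silently retry, so the error can be masked before it ever reaches the caller. The `Flaky` writer below is a hypothetical type that demonstrates this retry behavior:

    ```rust
    use std::io::{ErrorKind, Result, Write};

    /// A writer that fails with `Interrupted` a fixed number of times
    /// before succeeding, showing that `write_all` transparently retries
    /// `Interrupted` errors.
    struct Flaky {
        failures_left: u32,
        written: Vec<u8>,
    }

    impl Write for Flaky {
        fn write(&mut self, buf: &[u8]) -> Result<usize> {
            if self.failures_left > 0 {
                self.failures_left -= 1;
                return Err(ErrorKind::Interrupted.into());
            }
            self.written.extend_from_slice(buf);
            Ok(buf.len())
        }
        fn flush(&mut self) -> Result<()> {
            Ok(())
        }
    }

    fn main() {
        let mut w = Flaky { failures_left: 2, written: Vec::new() };
        // `write_all` retries the two `Interrupted` errors, then succeeds.
        w.write_all(b"data").unwrap();
        println!("{}", String::from_utf8_lossy(&w.written)); // data
    }
    ```

   The async path through `hdrs`/`blocking` behaves differently, which is presumably why the failure surfaces there as `Ok(0)` rather than as the injected error.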
   
   ### Expected Behavior
   
   `writer.write(data).await` should return an error so the caller can handle 
the failure. Instead, the program hangs at `writer.write(data).await.unwrap()` 
and never recovers.
   
   ### Additional Context
   
   _No response_
   
   ### Are you willing to submit a PR to fix this bug?
   
   - [x] Yes, I would like to submit a PR.

