milenkovicm commented on code in PR #1388:
URL: https://github.com/apache/datafusion-ballista/pull/1388#discussion_r2701480074


##########
ballista/core/src/client.rs:
##########
@@ -287,6 +289,122 @@ impl BallistaClient {
     }
 }
 
+/// A connection pool for reusing `BallistaClient` connections to executors.
+///
+/// This pool caches connections by (host, port) to avoid the overhead of
+/// establishing new gRPC connections for each partition fetch during shuffle reads.
+/// Connections are stored indefinitely and reused across multiple fetch operations.
+///
+/// # Thread Safety
+///
+/// The pool uses a `RwLock` to allow concurrent reads while ensuring exclusive
+/// access during connection creation. The `BallistaClient` itself is `Clone`
+/// (wrapping an `Arc`), so cloned clients share the underlying connection.
+#[derive(Default)]
+pub struct BallistaClientPool {

Review Comment:
   I wonder whether it would make sense to use an off-the-shelf pool such as
   `deadpool` or `r2d2` instead of implementing our own?

   If I'm not mistaken, the current implementation can, in theory, leak
   connections: a connection is removed from the pool only when it fails, but
   an executor can be removed or replaced without its connection ever getting
   a chance to fail, so the stale entry would stay in the pool indefinitely.
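
To make the leak concern concrete, here is a minimal std-only sketch of one possible mitigation: record a last-used timestamp per entry and periodically drop entries that have been idle too long. It is not the PR's code and not the API of `deadpool` or `r2d2`. `IdleEvictingPool`, `max_idle`, and `evict_idle` are hypothetical names, and the generic `C` merely stands in for `BallistaClient`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical pool keyed by (host, port) that can evict idle entries.
/// `C` stands in for a cheaply clonable client such as `BallistaClient`.
pub struct IdleEvictingPool<C: Clone> {
    /// Assumed policy knob: how long an entry may sit unused.
    max_idle: Duration,
    /// Each entry stores the connection and the time it was last handed out.
    entries: Mutex<HashMap<(String, u16), (C, Instant)>>,
}

impl<C: Clone> IdleEvictingPool<C> {
    pub fn new(max_idle: Duration) -> Self {
        Self {
            max_idle,
            entries: Mutex::new(HashMap::new()),
        }
    }

    /// Returns a clone of the cached connection, if any, and refreshes
    /// its last-used time so active connections are never evicted.
    pub fn get(&self, host: &str, port: u16) -> Option<C> {
        let mut entries = self.entries.lock().unwrap();
        let entry = entries.get_mut(&(host.to_string(), port))?;
        entry.1 = Instant::now();
        Some(entry.0.clone())
    }

    /// Caches a connection, replacing any previous entry for the same key.
    pub fn insert(&self, host: &str, port: u16, conn: C) {
        let mut entries = self.entries.lock().unwrap();
        entries.insert((host.to_string(), port), (conn, Instant::now()));
    }

    /// Drops every entry unused for longer than `max_idle` and returns
    /// how many were removed. A background task could call this on a timer.
    pub fn evict_idle(&self) -> usize {
        let mut entries = self.entries.lock().unwrap();
        let before = entries.len();
        let max_idle = self.max_idle;
        entries.retain(|_, (_, last_used)| last_used.elapsed() <= max_idle);
        before - entries.len()
    }

    /// Number of connections currently cached.
    pub fn len(&self) -> usize {
        self.entries.lock().unwrap().len()
    }
}
```

With something along these lines, a connection to an executor that has been removed or replaced would eventually be dropped even if it never gets a chance to fail, at the cost of reconnecting to executors that are only used rarely.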



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

