[ 
https://issues.apache.org/jira/browse/HTTPCORE-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905548#comment-16905548
 ] 

Linton Miller commented on HTTPCORE-589:
----------------------------------------

I'm certainly open to alternatives, but I couldn't see a better choice when I 
was working it.

I thought I'd done OK at not being too intrusive or expensive with the fix :) 
Basically all it amounts to is when allocating a connection, wrapping the 
modification of the pool state in something that allows us to detect if there 
was a concurrent release. If so, after we've finished modifying the pool, if we 
end up pended, we should immediately check to see if the release allows us to 
fulfill our pend.

I achieve the detection of a concurrent release by using a Long that's simply 
incremented with each release. Thus if the long isn't the same when you finish 
connection allocation as when you started, you know there was a concurrent 
release done during your allocation and you should check pending.

So fundamentally the change is just when allocating a connection (in lease and 
in servicePendRequest) 

 
{code:java}
  final long releaseState = releaseSeqNum.get();


  ... try connection allocation ...
  <if allocation fails>
      <pending.add>

      if (releaseState != releaseSeqNum.get()) {
         servicePendingRequest();
      }{code}
 

And when releasing a connection (in release and enumAvailable) :

 
{code:java}
  .... release connection and update pool state ...
  releaseSeqNum.incrementAndGet(); {code}

The rest of the mods are just minor code tweaks and simplifications I made 
while getting this in place. e.g. cleaning up the allocation code so the 
allocation process reads more straight-forwardly:

 
{code:java}
    PoolEntry<T, C> entry = getAvailableEntry(state);
    if (entry == null && pending.isEmpty()) {
        entry = createPoolEntry();
    }
{code}
is a concise and tidy expression of allocating a pool entry, separate from what 
is then done with that pool entry. Creating a pool entry is completely handled 
in one abstracted location rather than splitting the allocation and actual 
creation.

And then the actions for assigning that allocated entry are only in a single 
place, with linear code flow:

 
{code:java}
    if (entry != null) {
        addLeased(entry);
        future.completed(entry);
    }
{code}
rather than copied and repeated for each of the different ways a pool entry 
might be obtained.

> LaxConnPool may not allocate available connection
> -------------------------------------------------
>
>                 Key: HTTPCORE-589
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-589
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore
>    Affects Versions: 5.0-beta8
>            Reporter: Linton Miller
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The LaxConnPool can fail to allocate available connections from the pool in 
> rare circumstances.
> The issue happens when 2 threads execute concurrently; one requesting a lease 
> while another releases a pool entry.
> Thread 1
> future = lease
>   getAvailableEntry = null
>   allocatePoolEntry = false
>   <about to add a LeaseRequest to pending queue>
> Thread 2
> release
>   removeLeased
>   available.add
>   servicePendingRequest
>     while (pending.poll() != null)  does nothing because pending queue is 
> empty
>   <release call completed, with an entry now in the available list>
> Thread 1
>   pending.add(new LeaseRequest())
>   <lease call completed, with an unsatisfied future>
> future.get()
> And now Thread 1 is blocked until there is another release call, even though 
> there is actually an available connection in the pool.
> The problem arises because there is a window between checking for an 
> available pool entry and registering on the pending request queue. In that 
> window, the statte of the pool can change.
> Recreating this problem is test is extremely difficult, because the window 
> for failure is really very small. I've only be able to observe the error in 
> running code by introducing synchronization or delay (as small as 3ms, but 
> still needed) into the pool code just before the lease pending.add, so as to 
> ensure the necessary timing condition between threads exist. My test class:
> {code:java}
> import org.apache.hc.core5.io.ModalCloseable;
> import org.apache.hc.core5.io.CloseMode;
> import org.apache.hc.core5.pool.PoolEntry;
> import org.apache.hc.core5.pool.LaxConnPool;
> import org.apache.hc.core5.pool.PoolEntry;
> import org.apache.hc.core5.util.TimeValue;
> import java.util.concurrent.Future;
> public class PoolPendTest {
>   private static class TestConn implements ModalCloseable {
>     public void close() {
>     }
>     public void close(CloseMode mode) {
>     }
>   }
>   private static LaxConnPool<String,TestConn> pool;
>   private static void dbg(String msg) {
>     System.out.println(Thread.currentThread().getName()+": 
> "+System.currentTimeMillis()+": "+msg);
>   }
>   public static void main(String[] args) {
>     final int poolSize = 1;
>     pool = new FixPendPool<>(poolSize);
>     // Create a pool entry purely to prime class loading
>     new PoolEntry<>("", TimeValue.NEG_ONE_MILLISECONDS);
>     
>     for (int i=0;i<poolSize*2;i++) {
>       new Thread(String.format("req-%04d", i)) {
>         @Override
>         public void run() {
>           dbg("Started");
>           Future<PoolEntry<String,TestConn>> req = 
> pool.lease("somehost",null);
>           PoolEntry<String,TestConn> entry = null;
>           if (req.isDone()) {
>             try {
>               entry = req.get();
>               if (!entry.hasConnection()) {
>                 entry.assignConnection(new TestConn());
>                 pool.release(entry, true);
>                 dbg("Released pool entry "+System.identityHashCode(entry));
>               }
>             }
>             catch (Exception shouldNotHappen) {
>               dbg("Should not happen!");
>               shouldNotHappen.printStackTrace();
>             }
>           }
>           try {
>             dbg("Assigned pool entry "+System.identityHashCode(req.get()));
>           }
>           catch (Exception fail) {
>             dbg("Getting pool entry failed");
>             fail.printStackTrace();
>           }
>         }
>       }.start();
>     }
>   }
> }
> {code}
> When I modify the LaxConnPool code to add a 3ms delay just before adding to 
> the pend queue:
> {code:java}
>     try { Thread.sleep(3); } catch (InterruptedException onward) {}
>     pending.add(new LeaseRequest<>(state, requestTimeout, future));
> {code}
> then the test class blocks with an infinite wait where the pool contains a 
> returned connection, but it's never assigned to the second thread.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to