2010/3/26 Niklas Gustavsson <nik...@protocol7.com>:
> On Fri, Mar 26, 2010 at 9:50 AM, Fred Moore <fred.moor...@gmail.com> wrote:
>> 1\ Priority of passive port sharing enhancement: Niklas's survey shows that we
>> are indeed in good company here, but it's probably worth having a better
>> look at this anyway; there might be good technical reasons that led all the
>> other teams not to support this, or it may turn out that it's "simply" because
>> it's somewhat hard to develop and test.
>
> After this discussion I'm significantly less thrilled at implementing
> shared passive ports :-)

Shared passive ports would be a nice feature if they aren't too hard
to implement. Among the open-source servers, I think ColoradoFTP (an
NIO-based Java FTP server under the LGPL license) offered this. Since
their data connections also use async sockets, this shouldn't be too
hard for them, but I don't know if they solved the use case described
by Sai, where several sessions are open from the same IP. It seems,
though, that commercial solutions offer this and more...



>> 2\ Quick fix for 1.0.x codebase: pushing a 40x to the client when no
>> passive port is available (or probably better: no passive port is available
>> within X seconds) is probably something we need to do anyway.
>
> Thinking some more about this, I'm personally now convinced that we
> should simply return an error (not wait). I'm not sure what the
> best reply code should be, but "425 Can't open data connection" seems
> fitting, although it is not specified as a valid reply to the PASV command.
>
>> 3\ Suspected race condition: the problem description for the originally
>> reported http://issues.apache.org/jira/browse/FTPSERVER-359 (see also repro
>> code attached to the jira) actually also hints at something different.
>> In fact we state that a few (say 20) parallel threads issuing LISTs in
>> passive mode are able to "lock up" the server forever. Questions:
>>
>> 3.1\ Is this entirely explained by this thread discussion? (I don't think
>> so: the server should *always* be able to recover)
>
> Agreed, the server should always recover from a situation like this.
> After looking into how to fix item 2, we need to rerun your tests and
> make sure we always survive.

Thinking about this issue my understanding of the problem is as follows:

1. We have a number of connections to FTPServer greater than the Executor
thread pool's max size (I think it is 16), all sending the PASV command.

2. The first one of them requests the only available port and gets it.
Now the port is in use by a server socket and any subsequent call to
requestPassivePort will end up invoking wait().

3. The thread that processed this PASV command is now available and a
new PASV request is assigned to it.

4. Now all threads are trying to request a passive port, but since
there are no ports available, all the threads in the OrderedThreadPool
get blocked by the wait() method.
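
To make steps 1-4 concrete, here is a minimal Java sketch of how I picture
the lock-up, next to the fail-fast alternative discussed under item 2. It is
only an illustration under assumptions: PassivePortPool, requestBlocking(),
tryRequest() and PasvStarvationDemo are hypothetical names of mine, not
FtpServer's actual classes; a plain fixed thread pool stands in for the
OrderedThreadPool; port 2300 and the 127.0.0.1 address in the 227 reply are
arbitrary.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the passive port allocator (not FtpServer code). */
class PassivePortPool {
    private final Deque<Integer> free = new ArrayDeque<>();

    PassivePortPool(int... ports) {
        for (int p : ports) {
            free.add(p);
        }
    }

    /** Blocking variant: parks the calling worker thread until a port is released. */
    synchronized int requestBlocking() throws InterruptedException {
        while (free.isEmpty()) {
            wait();
        }
        return free.poll();
    }

    /** Fail-fast variant: returns -1 immediately when no port is free. */
    synchronized int tryRequest() {
        Integer port = free.poll();
        return port == null ? -1 : port;
    }

    synchronized void release(int port) {
        free.add(port);
        notifyAll();
    }
}

class PasvStarvationDemo {
    /**
     * Submits pasvRequests blocking PASV handlers, then one task that would
     * release the single port. Returns true if that task never got a thread.
     */
    static boolean releaseIsStarved(int poolThreads, int pasvRequests) throws Exception {
        PassivePortPool ports = new PassivePortPool(2300);
        ExecutorService workers = Executors.newFixedThreadPool(poolThreads);
        try {
            for (int i = 0; i < pasvRequests; i++) {
                workers.submit(() -> {
                    try {
                        ports.requestBlocking();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Stands in for the data command that would eventually free the port.
            Future<?> release = workers.submit(() -> ports.release(2300));
            try {
                release.get(500, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException e) {
                return true; // every worker is stuck in wait()
            }
        } finally {
            workers.shutdownNow(); // interrupts the parked workers
        }
    }

    /** Item 2 sketch: reply at once instead of waiting for a port. */
    static String pasvReplyFailFast(PassivePortPool ports) {
        int port = ports.tryRequest();
        if (port < 0) {
            return "425 Can't open data connection";
        }
        // RFC 959 format (h1,h2,h3,h4,p1,p2); loopback address assumed here.
        return "227 Entering Passive Mode (127,0,0,1," + (port / 256) + "," + (port % 256) + ")";
    }

    public static void main(String[] args) throws Exception {
        System.out.println("2 threads, 3 PASV: starved = " + releaseIsStarved(2, 3));
        System.out.println("4 threads, 3 PASV: starved = " + releaseIsStarved(4, 3));
        PassivePortPool ports = new PassivePortPool(2300);
        System.out.println(pasvReplyFailFast(ports));
        System.out.println(pasvReplyFailFast(ports));
    }
}
```

With more PASV requests than worker threads, the task that would release the
port stays queued behind the waiting handlers, which is the situation in
step 4. The fail-fast variant never parks a worker, so the pool stays
available for the commands that free ports.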

I wonder if we are suffering from a similar problem in any other cases; if
so, we might need to delay opening the ServerSocket until the LIST (or
GET or PUT...) command is executed.

I hope I made myself clear and that my understanding was right.


>> 3.2\ Would this be fixed by a quick fix as per 2\? (likely, but it's sort of
>> like using nukes for mowing the lawn)
>
> I really have no idea, but I think we should fix 2 first and then make
> sure we handle your test case.
>
>> In short my current position can be stated as follows: I think that
>> FTPSERVER-359 has a different root cause from what we discussed. The problem's
>> impact is not completely known at the moment, but it appears to *severely*
>> affect the server's availability... having just one port is an easy way of
>> reproducing it (but not the cause of it).
>
> Agreed.
>
> /niklas
>
