On Fri, 17 Sep 2010, Thomas Rauscher wrote:
(First, I'm sorry it's taken me this long to respond...)
I think I've found a problem that occurs when writing to the send socket
returns -1 (EAGAIN). Additional preconditions to trigger the problem are
writing in larger chunks than the advertized window size, e.g. 128k writes
vs. 12k window size.
I don't quite understand your problem. I'll add my questions and thoughts
inline below.
The remote side is a dropbear SSH server which seems to use 12k window size
increments. This means that packets need to be split very often. If
additionally the socket buffer gets full, the saved packet is never sent.
A workaround is to use smaller writes (1k), but this only hides the problem.
Details:
1) The application calls _libssh2_channel_write(..., 128*1024);
First, _libssh2_channel_write() will internally ignore everything that is
larger than 32768 bytes. It will only try to send the first 32768 bytes in
each invocation.
The function will/should then make sure that it doesn't try to send any more
data than the remote has a window for. In this case, it should further
decrease the amount of data this function will attempt to send.
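
To make the two limits concrete, here is a minimal sketch of the clamping I mean. The names clamp_write_len and MAX_CHUNK are made up for illustration and are not the actual libssh2 internals; only the 32768 figure comes from the description above.

```c
#include <stddef.h>

/* Hypothetical name for the 32768-byte per-call cap described above. */
#define MAX_CHUNK 32768

/* Return how many bytes a single write attempt should try to send:
 * the caller's length, capped first at MAX_CHUNK and then at whatever
 * window the remote side currently advertises. */
static size_t clamp_write_len(size_t requested, size_t remote_window)
{
    size_t len = requested;

    if (len > MAX_CHUNK)
        len = MAX_CHUNK;
    if (len > remote_window)
        len = remote_window;

    return len;
}
```

With a 128K request and a 12K window, this sketch would attempt 12K in that call and leave the rest for later calls.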
* In _libssh2_transport_write()
_libssh2_send returns -1 (EAGAIN) and the current packet is saved to
p->odata, p->olen ...
You mean that it returns EAGAIN immediately or after having sent the first 12K
of data? I assume you mean that it first sends some data and then when it
loops it gets EAGAIN back.
* _libssh2_transport_write() returns LIBSSH2_ERROR_EAGAIN to
_libssh2_channel_write() which executes
    if(wrote) {
        _libssh2_transport_drain(session);
        goto _channel_write_done;
    }
... as it would only execute that if 'wrote' actually wasn't zero.
_libssh2_transport_drain() frees p->outbuf and sets it to NULL.
* _libssh2_channel_write then returns "wrote" (12k) to the application.
Right, as it did in fact successfully send away 12K.
2) Application calls _libssh2_channel_write(..., 128*1024) again.
Right, but that buffer should now be pointing 12K further into the data as 12K
was in fact sent in the previous invocation.
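
As a rough illustration of what I expect the caller to do, here is a self-contained sketch of a write loop that advances the buffer by whatever the previous call reported as written. Everything in it is hypothetical: send_all, fake_write, struct fake_peer and WOULD_BLOCK are stand-ins invented for this example, not libssh2 functions or constants. The fake writer accepts at most 12 bytes per call and reports "would block" on every second call, which loosely mimics a small window combined with a full socket buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an EAGAIN-style return code. */
#define WOULD_BLOCK (-1000)

typedef long (*write_fn)(void *ctx, const char *buf, size_t len);

/* Send all of buf. After each partial write, the next call starts
 * 'off' bytes further into the caller's data. Returns the total sent,
 * or a negative error other than WOULD_BLOCK. */
static long send_all(write_fn w, void *ctx, const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        long rc = w(ctx, buf + off, len - off);

        if (rc == WOULD_BLOCK)
            continue;   /* real code would wait for the socket here */
        if (rc < 0)
            return rc;  /* hard error, give up */
        off += (size_t)rc;
    }
    return (long)off;
}

/* Fake receiver used only to exercise the loop. */
struct fake_peer {
    char   received[64];
    size_t used;
    int    calls;
};

static long fake_write(void *ctx, const char *buf, size_t len)
{
    struct fake_peer *p = ctx;
    size_t room = sizeof(p->received) - p->used;

    if (p->calls++ % 2 == 1)
        return WOULD_BLOCK;   /* every second call pretends to block */
    if (room == 0)
        return -1;            /* no space left, report a hard error */
    if (len > 12)
        len = 12;             /* accept at most 12 bytes per call */
    if (len > room)
        len = room;
    memcpy(p->received + p->used, buf, len);
    p->used += len;
    return (long)len;
}
```

If the caller instead passes the original buffer start again after a partial write, the already-sent bytes go out a second time, so the pointer arithmetic in send_all is the part that matters here.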
_libssh2_transport_write() now calls send_existing() first which
immediately returns because p->outbuf is NULL.
    if (!p->outbuf) {
        *ret = 0;
        return LIBSSH2_ERROR_NONE;
    }
Right, there's nothing saved there. What do you think it should have saved
there?
* This results in not sending the saved packet, but sending the next packet.
The SSH server then bails out and terminates the connection (saying "bad
packet size").
... as you can see I didn't follow how it ended up like this! I'll get myself
a dropbear install and see if I can repeat this. Is uploading data with a 128K
buffer enough to trigger it? Like with the sftp_write_nonblock.c example?
--
/ daniel.haxx.se