On Fri, 17 Sep 2010, Thomas Rauscher wrote:

(First, I'm sorry it's taken me this long to respond...)

I think I've found a problem that occurs when writing to the send socket returns -1 (EAGAIN). An additional precondition for triggering the problem is writing in larger chunks than the advertised window size, e.g. 128k writes vs. a 12k window size.

I don't quite understand your problem. I'll add my questions and thoughts inline below.

The remote side is a dropbear SSH server which seems to use 12k window size increments. This means that packets need to be split very often. If, in addition, the socket buffer gets full, the saved packet is never sent.

A workaround is to use smaller writes (1k), but this only hides the problem.

Details:

1) The application calls _libssh2_channel_write(..., 128*1024);

First, _libssh2_channel_write() will internally ignore everything beyond the first 32768 bytes. It will only try to send the first 32768 bytes in each function invocation.

The function will/should then make sure that it doesn't try to send any more data than the remote has a window for. In this case, it should further decrease the amount of data this function will attempt to send.
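
To make the two limits concrete, here is a minimal sketch of the clamping described above. The helper name clamp_chunk and the MAX_CHUNK macro are made up for illustration and are not libssh2 identifiers; only the 32768-byte cap comes from this mail.

```c
#include <stddef.h>

/* Per-call cap taken from the discussion above. */
#define MAX_CHUNK 32768u

/* Hypothetical helper, not a libssh2 function: returns how many bytes a
 * single write call would attempt, given the caller's request and the
 * window the remote side currently has open. */
static size_t clamp_chunk(size_t requested, size_t remote_window)
{
    size_t n = requested;

    if (n > MAX_CHUNK)
        n = MAX_CHUNK;        /* ignore everything past the cap */
    if (n > remote_window)
        n = remote_window;    /* never exceed the remote window */
    return n;
}
```

With a 128k request and a 12k window, this sketch would attempt 12288 bytes; with a large window, it would attempt 32768.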

 * In _libssh2_transport_write()

_libssh2_send returns -1 (EAGAIN) and the current packet is saved to p->odata, p->olen ...

You mean that it returns EAGAIN immediately or after having sent the first 12K of data? I assume you mean that it first sends some data and then when it loops it gets EAGAIN back.

* _libssh2_transport_write() returns LIBSSH2_ERROR_EAGAIN to _libssh2_channel_write() which executes

   if(wrote) {
     _libssh2_transport_drain(session);
     goto _channel_write_done;
   }

... as it would only execute that if 'wrote' actually wasn't zero.

   _libssh2_transport_drain() frees p->outbuf and sets it to NULL.

 * _libssh2_transport_write then returns "wrote" (12k) to the application.

Right, as it did in fact successfully send away 12K.

2) Application calls _libssh2_channel_write(..., 128*1024) again.

Right, but that buffer should now be pointing 12K further into the data, as 12K was in fact sent in the previous invocation.
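
What I expect the calling side to do can be sketched roughly like this. The write_fn signature, fake_write and send_all are all hypothetical stand-ins for illustration, not libssh2 API; the stand-in writer simply accepts at most 12K per call to mimic the window in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical writer signature: returns the number of bytes accepted
 * (> 0), or -1 when the call would block and nothing was accepted. */
typedef long (*write_fn)(void *ctx, const char *buf, size_t len);

/* Stand-in channel for illustration: counts calls and bytes. */
struct fake_channel {
    size_t calls;
    size_t total;
};

/* Stand-in writer: accepts at most 12K per call. */
static long fake_write(void *ctx, const char *buf, size_t len)
{
    struct fake_channel *ch = ctx;
    size_t n = len < 12288 ? len : 12288;

    (void)buf;                /* the data itself is not inspected here */
    ch->calls++;
    ch->total += n;
    return (long)n;
}

/* Keep calling the writer, advancing the pointer by however much was
 * accepted, until everything is sent or the writer would block. */
static size_t send_all(write_fn w, void *ctx, const char *buf, size_t len)
{
    size_t sent = 0;

    assert(w != NULL && buf != NULL);
    while (sent < len) {
        long rc = w(ctx, buf + sent, len - sent);
        if (rc <= 0)
            break;            /* would block: caller retries from buf + sent */
        sent += (size_t)rc;
    }
    return sent;
}
```

The point is only that each call starts at buf + sent, never at the start of the buffer again.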

_libssh2_transport_write() now calls send_existing() first which immediately returns because p->outbuf is NULL.

 if (!p->outbuf) {
   *ret = 0;
   return LIBSSH2_ERROR_NONE;
 }

Right, there's nothing saved there. What do you think it should have saved there?

* This results in not sending the saved packet, but sending the next packet. The SSH server then bails out and terminates the connection (saying "bad packet size").
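
The sequence as reported can be modelled with a toy transport like the one below. This is only a sketch of the report, with hypothetical names (toy_transport, toy_send, toy_drain) and packets reduced to integer ids; it is not libssh2's code, and whether libssh2 really behaves this way is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>

/* Toy transport: at most one packet may be left pending when the
 * socket would block. */
struct toy_transport {
    int pending;        /* id of a saved, unsent packet; 0 if none */
    int wire[8];        /* packet ids the peer received, in order */
    size_t wire_len;
};

static void put_on_wire(struct toy_transport *t, int id)
{
    assert(t->wire_len < 8);
    t->wire[t->wire_len++] = id;
}

/* Returns 0 when the packet went out, -1 when the socket was full. */
static int toy_send(struct toy_transport *t, int id, int socket_full)
{
    if (t->pending) {           /* flush any saved packet first */
        if (socket_full)
            return -1;
        put_on_wire(t, t->pending);
        t->pending = 0;
    }
    if (socket_full) {
        t->pending = id;        /* save the packet for a later call */
        return -1;
    }
    put_on_wire(t, id);
    return 0;
}

/* Discards the saved packet, as the report says the drain call does. */
static void toy_drain(struct toy_transport *t)
{
    t->pending = 0;
}
```

In this model, sending packet 1, failing on packet 2 with a full socket, draining, and then sending packet 3 leaves the peer with packets 1 and 3 only. Without the drain, the peer would get 1, 2, 3.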

... as you can see I didn't follow how it ended up like this! I'll get myself a dropbear install and see if I can repeat this. Is uploading data with a 128K buffer enough to trigger it? Like with the sftp_write_nonblock.c example?

--

 / daniel.haxx.se
_______________________________________________
libssh2-devel http://cool.haxx.se/cgi-bin/mailman/listinfo/libssh2-devel
