Re: [Fwd: RE: ssh problem on Windows XP]

2005-01-22 Thread Bob Byrnes
On Jan 21,  6:34pm, Corinna Vinschen wrote:
-- Subject: [Fwd: RE: ssh problem on Windows XP]

 is there any chance that we get a fix in the next couple of weeks?

I remain absolutely committed to fixing the problems that have been
reported, but I can't say that I'll have a fix in that timeframe,
because I have some urgent deadlines for other projects.  Maybe
early to mid-February?

 If we don't get a patch, I'm inclined to revert the pipe patch before
 we release 1.5.13.

Instead of reverting the entire patch, if you want to restore the old
behavior (select always returning true for writes on pipes), you could
add a small piece of code to short-circuit the NtQueryInformationFile
logic that I added.

That would make it much easier for me to apply my fix when it's available,
because I could just remove the short-circuit when I add a test to
detect the problem, which I think I understand completely, and have
described in an earlier posting: NtQueryInformationFile acts strangely
when there is a pending, blocking read on the other end of the pipe.
I need time to finish prototyping the new test, however.

 Btw., didn't you announce more pipe patches yet to come?  Is it possible
 that you already have a patch which will get that working again?  I'm
 still hoping for something more satisfying than reverting...
 
-- End of excerpt from Corinna Vinschen

Yes, I have more patches, but they don't fix the outstanding problem
with the first patch (I would have certainly sent the fix if I had one).
All of my fixes are related to detecting and avoiding deadlocks, and I
have some that are not pipe related.

In case anyone is curious, let me relate the story of what started all
of this ...

At Curl (where I work), we have a pool of about a dozen Windows servers
that we use for automated builds of our products.  Each build starts by
rsync-ing the sources from a Linux server to a Windows build server (over
ssh), then we launch make via ssh and collect std{out,err} over an ssh
channel, and finally the build finishes by rsync-ing the build tree back
from the Windows build server to the Linux build server, again over ssh.

Our Windows build servers are in almost continuous use.  As you
can imagine, this setup acts as a severe stress test for Cygwin.
Unfortunately, last year Cygwin deadlocks were killing our productivity:
at least 25% of our builds were wedging, which was completely
unacceptable.

So, I rolled up my sleeves, installed a Cygwin DLL with all of the
debugging symbols, and went to work investigating and fixing each deadlock
that I encountered.

It soon became apparent that pipes were a major problem.  We observed
deadlocks because select for writes on pipes always returned true,
so I implemented a fix (that's the first patch).  Similar deadlocks
occurred because nonblocking writes on pipes could block, so I added
an implementation for nonblocking writes too (a second patch, which I
submitted, but was never applied, because we wanted to investigate the
reported problems with the first patch).

Later, I found that Cygwin is burned by unfortunate winsock behavior
that we had already encountered in other contexts: it sometimes assigns a
local port that can't actually be used immediately, because it is still
in TIME_WAIT, so connect fails with EADDRINUSE.  I fixed one nasty case
where this phenomenon can cause a missed notification in the code for
select on sockets (a third patch), and another that caused socketpair
to fail sporadically (a fourth patch).

The improvement in Cygwin's behavior with these four patches has been
dramatic at Curl.  Our builds almost never experience deadlocks now.  I am
eager to contribute them to the rest of the world, but I recognize that
I need to fix the first patch before we apply the rest, and I will do it.

Our patches have been extensively tested, but we missed the problem
that occurs for pending, nonblocking reads, because our automated builds
don't use commands like sftp, unison, etc.  Most of the other commands
seem to use nonblocking I/O on pipes (often with select), and that works
with my patches.

--
Bob


Re: More pipe problems (was Re: [Fwd: 1.5.11-1: sftp performance problem])

2004-10-05 Thread Bob Byrnes
I've been working on this, and I think I now understand the underlying
problem, but I don't yet have a fix.

The problem is that NtQueryInformationFile is stranger than I ever imagined.

First, some background, to be sure we're all speaking the same language:
Windows pipes actually seem more similar to UNIX domain sockets than
POSIX pipes.  In particular, Windows pipes have a server side and a
client side.

Most of the Windows pipe terminology is from the standpoint of the server.
An inbound pipe transfers data only in the client-server direction, an
outbound pipe transfers server-client, and a duplex pipe transfers
both ways.

We use CreateNamedPipe to make an inbound pipe for reading: this is the
server side.  Only the input buffer size is relevant for such a pipe ...
we should probably set the output buffer size to zero to emphasize that it
is not used.  We currently set both input and output buffer sizes to the
same value, which is harmless but misleading.

We then use CreateFile to obtain a handle to the other (client) side of
the pipe for writing, with attributes to allow NtQueryInformationFile
to work.

When we call NtQueryInformationFile with the FilePipeLocalInformation
parameter (FILE_INFORMATION_CLASS), it fills in a FILE_PIPE_LOCAL_INFORMATION
struct, which contains the following fields (among others that are not
directly related to pipe buffering):

  ULONG InboundQuota;
  ULONG ReadDataAvailable;
  ULONG OutboundQuota;
  ULONG WriteQuotaAvailable;

The InboundQuota and OutboundQuota are the input and output buffer sizes,
respectively, that were specified for CreateNamedPipe.  For our pipe
configuration, therefore, the OutboundQuota is irrelevant: we should be
checking the InboundQuota when we are interested in the pipe size.  This
does not currently affect us because we set InboundQuota == OutboundQuota,
but we should fix it.

The ReadDataAvailable field is always zero on the client side of an inbound
pipe.  We're ignoring it, as we should be.

Our code assumes that the WriteQuotaAvailable field tells how much space is
available for writing.  Well, not quite.

For an empty pipe, WriteQuotaAvailable == InboundQuota.  When data has
been written to the pipe (but not yet read by the other side), then
WriteQuotaAvailable is decremented by the appropriate amount, until
the pipe is full, when WriteQuotaAvailable is zero.  WriteFile blocks
only when it tries to write more than WriteQuotaAvailable.  All of this
is normal and expected.

But there is a strange twist:  When a read is pending on an empty pipe,
then WriteQuotaAvailable is also decremented!  I can't imagine why this
would be the case, but it is easy to demonstrate using a pair of small
test programs that I wrote to experiment with pipe buffering.

This behavior is unfortunate, because it means that we aren't currently
distinguishing pending reads and writes.  However, now that I understand
what's really going on, I can focus on working around it.

We have been baffled by the apparent contradiction that was nicely
summarized by Corinna:

 Do I understand that right?  sftp is in the blocking read on the pipe,
 there is data in the pipe and nevertheless read doesn't return?  That's odd.

ReadFile only blocks when the pipe is completely empty.  If there is
any data in the pipe, then it returns immediately, possibly supplying
less data than was requested.

When ReadFile blocks, there really isn't any data in the pipe.  Our select
code is just confused by the decremented WriteQuotaAvailable field, because
it looks like a *write* is pending!  So we start to throttle the writes,
which causes the data transfer to slow down, or even to block (if the
requested read was large enough).

I think my speculations about POSIX atomicity requirements related to
PIPE_BUF were a red herring.  Ditto for hypothetical bizarre pipe
buffering behavior.  The real problem here is NtQueryInformationFile.

I want to first try reversing the direction of the pipe.  We don't really
care which end is the server or client, so we could easily modify our pipe
creation code to make an outbound pipe, and if WriteQuotaAvailable from
NtQueryInformationFile behaves sanely, then we're done.

If that doesn't work, then I have some other ideas about how to distinguish
pending reads and writes.

I hope to have more info to report soon.

--
Bob


Re: [Fwd: 1.5.11-1: sftp performance problem]

2004-09-13 Thread Bob Byrnes
On Sep 12,  4:42pm, Corinna Vinschen wrote:
-- Subject: Re: [Fwd: 1.5.11-1: sftp performance problem]

 Do I understand that right?  sftp is in the blocking read on the pipe,
 there is data in the pipe and nevertheless read doesn't return?  That's
 odd.

Yes, very.  I'm still experimenting, so the following description
should be considered tentative:

ReadFile seems to want to return the number of bytes requested, not
a partial amount, at least the way we're using it with pipes now.

Bytes in the pipe seem to be buffered on the write side until there
are enough to satisfy the read, or the pipe fills (or maybe there is
some other high-water mark, but it is very close to full).  The system
also seems to empty the buffer after a certain amount of time ... I'm
not sure about the interval, but it's fairly long.

This all works most of the time.  But it interacts badly with the
POSIX atomic write requirements related to PIPE_BUF.  In particular,
select should say that a pipe is not writable when there is  PIPE_BUF
space available (and our implementation does this).

So, when sftp does a large blocking read (at least 16k, which is the
size of the pipe buffer), and ssh almost fills the pipe, then select
stops further writes from happening, and we only deliver data when
the timer expires, which is why transfers are so slow.

This also explains why the problem disappears when we use sftp -B to
reduce the size of the reads, because the data is delivered before
the pipe fills to within PIPE_BUF bytes.  And similarly when I increase
the size of the pipe buffer.

Most other programs that use select or nonblocking I/O aren't affected,
because they only try to read whatever is already in the pipe, and
they get the data immediately.

 | I guess this means that local
 | pipes always do buffering as described in the previous paragraph, and
 | this can't be disabled using FILE_FLAG_WRITE_THROUGH.
 
 Did you try that?

I haven't yet, but I will.  Disabling buffering would fix the problem.
Or if we could somehow control the buffering parameters (the high-water
mark or the timer), that would also probably be sufficient.  In particular,
setting the high-water mark to reserve PIPE_BUF bytes would be perfect.
I want to try FlushFileBuffers too, but we can't let it block (hmmm ...
maybe in a separate thread?).

The real problem is that NtQueryInformationFile returns information
about the data buffered on the write side of the pipe, and that
doesn't account for the possibility that a large read might already
be blocked on the other side, which effectively means that more space
is actually available.  If we could determine the amount of requested
data for the read, that would help.  The system must know, since it
seems to affect the buffering algorithm, but I don't know offhand how
to get at the info.

 Dunno if that's a *better* idea, but would it be reasonable to try changing
 pipes to use overlapped I/O?

Maybe, but that seems complicated.  I'm hoping for something simpler.

   Or what if a read from a pipe always asks for
 the number of available bytes using NtQueryInformationFile and then only
 actually reads this number of bytes and returns that to the caller?
 
-- End of excerpt from Corinna Vinschen

That's similar to nonblocking reads, and it might work, and I guess for
blocking reads we would never attempt to read more than the size of the
pipe minus PIPE_BUF.  If we don't want to return partial data, then we
could redo reads until all of the data is delivered to the calling
program, but I think partial reads should be OK.

On the read side, we'd use PeekNamedPipe instead of NtQueryInformationFile.

--
Bob


Re: [Fwd: 1.5.11-1: sftp performance problem]

2004-09-11 Thread Bob Byrnes
On Sep 11,  3:02pm, Corinna Vinschen wrote:
-- Subject: Re: [Fwd: 1.5.11-1: sftp performance problem]

 http://cygwin.com/acronyms/#PCYMTNQREAIYR

Sorry ... done.

 | One really annoying consequence of socketpairs for sshd is that
 | Windows-native (i.e. non-Cygwin) programs don't know how to write
 | directly to sockets on stdout (or stderr), so if you try to use
 | them via ssh, their output silently disappears.
 
 Oh lord!  I tried it with `net', `ping' and `nslookup', which are roughly
 the only Windows applications I use on the command line, and all three
 have no problems.  I'm wondering how the above can happen since the
 interactive application side is running in a PTY (which *is* a pipe),
 not on socketpairs.

Try it without a pty:

ssh cygwin-system-with-sshd-using-sockpairs win32-native-command

vs.

ssh cygwin-system-with-sshd-using-sockpairs win32-native-command | cat

 | I think OpenSSH uses pipes on most platforms by default, so I'm also
 
 Only on those platforms which have known problems with (or no implementation
 of) socketpairs.

OK, maybe I am misremembering, or the default changed ages ago.
I see that sshd on Linux uses socketpairs.

 | somewhat concerned that the socketpair-specific code in OpenSSH might be
 | a bit crufty.
 
 Why?  Socketpairs are in the same shape as any other socket connection
 since they are using the same code.  And I surely do care for socket
 operations.  See the ChangeLogs.  What especially do you think is crufty?

-- End of excerpt from Corinna Vinschen

My concern was about the state of the OpenSSH code base, not Cygwin, and
it was just a vague fear, based on the previous (I guess now incorrect)
[mis]understanding that OpenSSH prefers to use pipes.

 | It's not obvious to me why socketpairs would be inherently faster
 | than pipes.

It occurred to me after I wrote this that if no pipes were used at all,
then the select_pipe thread would not be spawned ... the select_socket
thread would handle everything.  This might be win, but we'd need to
measure to see if the effect is significant.

And we'd still have the problem with win32-native apps, which I think
is a show-stopper for socketpairs.

--
Bob


Re: [Fwd: 1.5.11-1: sftp performance problem]

2004-09-10 Thread Bob Byrnes
On Sep 10, 11:55am, [EMAIL PROTECTED] (Bob Byrnes) wrote:
-- Subject: Re: [Fwd: 1.5.11-1: sftp performance problem]

  ...  I don't use sftp much, but we pump *enormous* amounts
 of data through sshd otherwise, so it's odd that I haven't noticed any
 performance impact.

Interesting.  The problem seems to affect *only* sftp, and (as Corinna
noted) only inbound transfers.  Outbound sftp works fine, as do scp
transfers in both directions, and data sent through ssh in general.

This explains why I haven't noticed the problem; I generally use scp
instead of sftp, because the latter requires SSH protocol 2, and I need
to deal with a variety of machines, some of which still use the older
SSH protocol 1.

 I guess it's possible that my patch interacts badly with some other
 recent submissions; I did most of my testing with an earlier snapshot.
 
-- End of excerpt from Bob Byrnes

I can reproduce the problem with the earlier snapshot too, so it isn't
caused by anything recent.

I'll do some stracing to see what sftp is doing (that scp and ssh are not).

--
Bob


Re: [Patch] implementation of nonblocking writes on pipes

2004-09-09 Thread Bob Byrnes
On Sep 9, 10:06am, [EMAIL PROTECTED] (Christopher Faylor) wrote:
-- Subject: Re: [Patch] implementation of nonblocking writes on pipes

 Before we start adding more patches which are based on your previous work,
 could you reply to some of the problems raised in the cygwin mailing list?

Sure, I'll do that now (I've fallen a few days behind reading that list).

 There was one problem with Windows 95 which Corinna fixed but now there
 is another problem with using rsync, which I thought was one of the
 impetuses for your patch.

-- End of excerpt from Christopher Faylor

The win95 problem was unfortunate ... I don't have a win95 system here
to test.  I'll look more closely at the patch, but I think it is correct.

The rsync problem (I assume you are talking about rsync + xp sp2 failing)
appears to be unrelated to my patch, because I think the failure is reported
for file desriptors from a socketpair, not a pipe.  I'll try to confirm that.

In each case, I'll send a follow up message to the cygwin mailing list.

It's important to realize that my first patch does not fix all hangs
related to rsync or ssh.  I've identified and fixed at least three
distinct deadlock scenarios, and I'm submitting separate patches for each.

--
Bob


[Patch] implementation of select for write on pipes

2004-09-02 Thread Bob Byrnes
The following patch implements select for write on pipes.  Currently,
pipes always select writable, which sometimes causes programs like
rsync to hang when the pipe is full.  This can be observed directly from
strace output:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg07559.html

We have been using this patch successfully for several months for
automated builds of our Windows products using Cygwin, so it has been
extensively tested.  It is the first of several patches that I plan to
contribute, all with the goal of preventing Cygwin hangs, and it does
seem to really improve the behavior of programs like rsync (and sshd)
that use select extensively with pipes.

The existing code uses PeekNamedPipe to implement select for read on
pipes.  For the corresponding write test, I used NtQueryInformationFile to
fill in a FILE_PIPE_LOCAL_INFORMATION struct, which contains OutboundQuota
and WriteQuotaAvailable fields that allow us to detect when the pipe is
full.  I fixed the signature and return value for NtQueryInformationFile
as well.

Unfortunately, NtQueryInformationFile isn't supported by Win9x, so we
optimistically assume the pipe is always writable on those systems (i.e.,
no change from the status quo).

NtQueryInformationFile also requires FILE_READ_ATTRIBUTES access on the
write side of the pipe, but CreatePipe doesn't set that (at least before
WinXP SP2), so I added a new create_selectable_pipe function to do the
right thing: it uses CreateNamedPipe for the read side, and CreateFile
for the write side (with FILE_READ_ATTRIBUTES).  It's important to only
allow a single pipe instance, to ensure that the named pipe was not
created earlier by some other process.  We retry if necessary until
we're sure that we really have a new pipe (unlikely to ever happen,
but it's nice to be bulletproof).

If we somehow inherit a (non-Cygwin) pipe without FILE_READ_ATTRIBUTES,
then NtQueryInformationFile will fail, and we again optimistically assume
that the pipe is writable in this case.

We also have to deal with Win9x, which doesn't support CreateNamedPipe,
so we just call CreatePipe on those systems to use an anonymous pipe as
the best approximation.

(As an aside, I couldn't help noticing that CreatePipe stupidly creates
a full duplex pipe, which is a waste, since only a single direction is
actually used.  We avoid that silliness with our named pipes.)

For a pipe to select writable, it must have at least PIPE_BUF bytes
available, to satisfy POSIX atomic write requirements.  Subsequent writes
with size  PIPE_BUF can still block, but most (all?) UNIX variants seem
to work this way.

For experimenting, we use the CYGWIN_PIPEBUFSIZE environment variable to
set the buffer size for pipes, with 4*PIPE_BUF == 16k as the (unchanged)
default.  We refuse to create a tiny pipe with size  PIPE_BUF, however,
to ensure that there will always be enough space for POSIX atomic writes.
If we somehow inherit a tiny (non-Cygwin) pipe, then we consider the
pipe writable only if it is completely empty, to minimize the probability
that a subsequent write will block.

--
Bob Byrnes
Curl Corporation
1 Cambridge Center, 10th Floor
Cambridge, MA 02142-1612



2004-09-02  Bob Byrnes  [EMAIL PROTECTED]

* autoload.cc (NtQueryInformationFile): Return nonzero on error.
* ntdll.h (FILE_PIPE_LOCAL_INFORMATION): Add.
(NtQueryInformationFile): Fix types for last two arguments.
* pipe.cc: Include stdlib.h, stdio.h, limits.h, and ntdll.h.
(create_selectable_pipe): New function to create a pipe that can be
used with NtQueryInformationFile for select.
(fhandler_pipe::create): Call create_selectable_pipe instead of
CreatePipe.
(pipe): Use CYGWIN_PIPEBUFSIZE environment variable if defined, or
DEFAULT_PIPEBUFSIZE otherwise.
* select.cc: Include limits.h and ntdll.h.
(peek_pipe): Add select_printf output.  Call NtQueryInformationFile to
implement select for write on pipes.
(fhandler_pipe::select_read): Reorder field assignments to be
consistent with fhandler_pipe::select_write.
(fhandler_pipe::select_write): Initialize startup, verify, cleanup, and
write_ready fields for select_record.
(fhandler_pipe::select_except): Tweak indentation to be consistent with
fhandler_pipe::select_write.



Index: autoload.cc
===
RCS file: /cvs/src/src/winsup/cygwin/autoload.cc,v
retrieving revision 1.86
diff -u -p -r1.86 autoload.cc
--- autoload.cc 24 Jul 2004 09:41:34 -  1.86
+++ autoload.cc 2 Sep 2004 22:06:11 -
@@ -379,7 +379,7 @@ LoadDLLfuncEx (NtCreateToken, 52, ntdll,
 LoadDLLfuncEx (NtMapViewOfSection, 40, ntdll, 1)
 LoadDLLfuncEx (NtOpenFile, 24, ntdll, 1)
 LoadDLLfuncEx (NtOpenSection, 12, ntdll, 1)
-LoadDLLfuncEx (NtQueryInformationFile, 20, ntdll, 1