On Tue, Oct 29, 2019 at 03:03:43AM +0100, Bram Moolenaar wrote:
>
> Paul Ackersviller wrote:
>
> > > > With athena gui version on AIX, vim will consistently go into an
> > > > infinite loop if the network connection drops. Trussing such a process
> > > > points to a select system call, so I found this one without any check on
> > > > the return value. This patch mostly prevents the problem, although
> > > > not quite 100% of the time.
> > >
> > > It looks like this code depends on undocumented or system-specific
> > > behavior. At least for what I could find poll() and select() called
> > > with no file descriptors will always wait until the timeout and then
> > > return zero. Do you have documentation about when the error code would
> > > be returned?
> >
> > I can pass on man pages if you want to know possible errno values, but
> > that won't help with EINTR, as no OSes I use have that behaviour. I put
> > that check in only to mimic how vim is already handling select() errors
> > elsewhere, i.e. in RealWaitForChar() also in os_unix.c, as well as
> > can_write_buf_line() in the channel.c file.
>
> Not errno values, but just why it would return an error at all. It's
> documented that select() without any file descriptors can be used to
> wait with sub-second accuracy, for systems that don't have usleep(). But
> nowhere does it say it returns any error.
I'm attaching the system's select man page, and it looks like EINTR is
about the only candidate in this situation. I'd say it's ambiguous
whether an error can occur while waiting on a timeout, since the man page
doesn't mention it.
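
For reference, the kind of check under discussion can be sketched as
below. This is only an illustration, not the patch itself: delay_msec is
a made-up helper name, and restarting with the full timeout after EINTR
is an assumption that can oversleep if signals keep arriving.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper, not Vim's actual code: sleep for `msec`
 * milliseconds using select() with no descriptors, and report any
 * failure other than EINTR to the caller instead of ignoring it. */
int delay_msec(long msec)
{
    struct timeval tv;
    int rc;

    do {
        /* Re-initialise the timeout on every attempt, because select()
         * is allowed to modify it. Assumption: restarting with the full
         * interval is acceptable for a short delay. */
        tv.tv_sec = msec / 1000;
        tv.tv_usec = (msec % 1000) * 1000;
        rc = select(0, NULL, NULL, NULL, &tv);
    } while (rc == -1 && errno == EINTR);

    return rc;  /* 0: timeout expired; -1: error other than EINTR */
}
```

A caller can then treat a -1 return as a reason to stop waiting rather
than continuing as if the delay had completed.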
> > > Also, I don't see how a hang can occur here when poll() or select()
> > > returns without waiting. Vim would simply continue. Or is the delay
> > > critical in some situation?
> >
> > Yes, continues infinitely, which is the issue... chewing up 100% of a CPU
> > until killed. Not sure how you got the idea of a hang.
>
> Where does it loop then? The place where you have the change doesn't
> loop, it returns.
You've got me wondering if the loop isn't really in Athena or X, and the
change I did is affecting timing somehow... I'll let you know if I get
anywhere, and thanks for your attention.

select Subroutine

Purpose

Checks the I/O status of multiple file descriptors and message queues.

Library

Standard C Library (libc.a)

Syntax

#include <sys/time.h>
#include <sys/select.h>
#include <sys/types.h>

int select (Nfdsmsgs, ReadList, WriteList, ExceptList, TimeOut)
int Nfdsmsgs;
struct sellist * ReadList, *WriteList, *ExceptList;
struct timeval * TimeOut;

Description

The select subroutine checks the specified file descriptors and message
queues to see if they are ready for reading (receiving) or writing
(sending), or if they have an exceptional condition pending.

When selecting on an unconnected stream socket, select returns when the
connection is made. If selecting on a connected stream socket, then the
ready message indicates that data can be sent or received. File
descriptors of regular files always select true for read, write, and
exception conditions. For more information on sockets, refer to
"Understanding Socket Connections" and the related "Checking for Pending
Connections Example Program" dealing with pending connections in AIX
Version 6.1 Communications Programming Concepts.

The select subroutine is also supported for compatibility with previous
releases of this operating system and with BSD systems.

On shared memory descriptors, the select subroutine returns true.

Note: If selecting on a non-blocking socket for both read and write
events and if the destination host is unreachable, select could show a
different behavior due to timing constraints. Refer to the Examples
section of this document for further information.

Parameters

Item
    Description

Nfdsmsgs
    Specifies the number of file descriptors and the number of message
    queues to check. The low-order 16 bits give the length of a bit mask
    that specifies which file descriptors to check; the high-order 16
    bits give the size of an array that contains message queue
    identifiers. If either half of the Nfdsmsgs parameter is equal to a
    value of 0, the corresponding bit mask or array is assumed not to be
    present.

TimeOut
    Specifies either a null pointer or a pointer to a timeval structure
    that specifies the maximum length of time to wait for at least one
    of the selection criteria to be met. The timeval structure is
    defined in the /usr/include/sys/time.h file and it contains the
    following members:

        struct timeval {
            int tv_sec;     /* seconds */
            int tv_usec;    /* microseconds */
        };

    The number of microseconds specified in TimeOut.tv_usec, a value
    from 0 to 999999, is set to one millisecond if the process does not
    have root user authority and the value is less than one millisecond.

    If the TimeOut parameter is a null pointer, the select subroutine
    waits indefinitely, until at least one of the selection criteria is
    met. If the TimeOut parameter points to a timeval structure that
    contains zeros, the file and message queue status is polled, and the
    select subroutine returns immediately.

ReadList, WriteList, ExceptList
    Specify what to check for reading, writing, and exceptions,
    respectively. Together, they specify the selection criteria. Each of
    these parameters points to a sellist structure, which can specify
    both file descriptors and message queues. Your program must define
    the sellist structure in the following form:

        struct sellist
        {
            ulong fdsmask[F];   /* file descriptor bit mask */
            int msgids[M];      /* message queue identifiers */
        };

    The fdsmask array is treated as a bit string in which each bit
    corresponds to a file descriptor. File descriptor n is represented
    by the bit (1 << (n mod bits)) in the array element
    fdsmask[n / BITS(int)]. (The BITS macro is defined in the values.h
    file.) Each bit that is set to 1 indicates that the status of the
    corresponding file descriptor is to be checked.

    Note: The low-order 16 bits of the Nfdsmsgs parameter specify the
    number of bits (not elements) in the fdsmask array that make up the
    file descriptor mask. If only part of the last int is included in
    the mask, the appropriate number of low-order bits are used, and the
    remaining high-order bits are ignored. If you set the low-order 16
    bits of the Nfdsmsgs parameter to 0, you must not define an fdsmask
    array in the sellist structure.

    Each int of the msgids array specifies a message queue identifier
    whose status is to be checked. Elements with a value of -1 are
    ignored. The high-order 16 bits of the Nfdsmsgs parameter specify
    the number of elements in the msgids array. If you set the
    high-order 16 bits of the Nfdsmsgs parameter to 0, you must not
    define a msgids array in the sellist structure.

    Note: The arrays specified by the ReadList, WriteList, and
    ExceptList parameters are the same size because each of these
    parameters points to the same sellist structure type. However, you
    need not specify the same number of file descriptors or message
    queues in each. Set the file descriptor bits that are not of
    interest to 0, and set the extra elements of the msgids array to -1.

    You can use the SELLIST macro defined in the sys/select.h file to
    define the sellist structure. The format of this macro is:

        SELLIST(f, m) declarator . . . ;

    where f specifies the size of the fdsmask array, m specifies the
    size of the msgids array, and each declarator is the name of a
    variable to be declared as having this type.

Return Values

Upon successful completion, the select subroutine returns a value that
indicates the total number of file descriptors and message queues that
satisfy the selection criteria. The fdsmask bit masks are modified so
that bits set to 1 indicate file descriptors that meet the criteria. The
msgids arrays are altered so that message queue identifiers that do not
meet the criteria are replaced with a value of -1.

The return value is similar to the Nfdsmsgs parameter in that the
low-order 16 bits give the number of file descriptors, and the
high-order 16 bits give the number of message queue identifiers. These
values indicate the sum total that meet each of the read, write, and
exception criteria. Therefore, the same file descriptor or message queue
can be counted up to three times. You can use the NFDS and NMSGS macros
found in the sys/select.h file to separate out these two values from the
return value. For example, if rc contains the value returned from the
select subroutine, NFDS(rc) is the number of files selected, and
NMSGS(rc) is the number of message queues selected.
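
The 16-bit split described above can be sketched in portable C. The
sketch_* functions are hypothetical stand-ins written for illustration;
they are not the NFDS and NMSGS macros from sys/select.h, and they only
follow the low-order/high-order layout stated in the text. A return
value of -1 must be checked before decoding.

```c
/* Hypothetical stand-ins for the AIX NFDS/NMSGS macros, following only
 * the layout described in the text: low 16 bits count file descriptors,
 * high 16 bits count message queues. */

/* Combine the two counts the way the Nfdsmsgs parameter is described. */
int sketch_pack(unsigned nfds, unsigned nmsgs)
{
    return (int)(((nmsgs & 0xFFFFu) << 16) | (nfds & 0xFFFFu));
}

/* Number of file descriptors encoded in a successful return value. */
unsigned sketch_nfds(int rc)
{
    return (unsigned)rc & 0xFFFFu;
}

/* Number of message queues encoded in a successful return value. */
unsigned sketch_nmsgs(int rc)
{
    return ((unsigned)rc >> 16) & 0xFFFFu;
}
```

For example, three file descriptors and two message queues pack to
0x20003, which decodes back to 3 and 2.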

If the time limit specified by the TimeOut parameter expires, the select
subroutine returns a value of 0.

If a connection-based socket is specified in the ReadList parameter and
the connection disconnects, the select subroutine returns successfully,
but the recv subroutine on the socket will return a value of 0 to
indicate the socket connection has been closed.

For nonblocking connection-based sockets, both successful and
unsuccessful connections will cause the select subroutine to return
successfully without any error.

When the connection completes successfully the socket becomes writable,
and if the connection encounters an error the socket becomes both
readable and writable.

The select subroutine cannot be used to check for pending errors on the
socket. You need to call the getsockopt subroutine with SOL_SOCKET and
SO_ERROR to check for a pending error.
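
A minimal sketch of that pending-error check follows. The standard
option name is SO_ERROR at level SOL_SOCKET; pending_socket_error is a
hypothetical helper name chosen for this illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: return the pending error on a socket (0 if there
 * is none), or -1 if getsockopt() itself fails. Reading SO_ERROR also
 * clears the pending error on the socket. */
int pending_socket_error(int sockfd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
```

After select reports a non-blocking socket as writable, a nonzero result
from this helper is the errno-style code of the failed connection.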

If the select subroutine is unsuccessful, it returns a value of -1 and
sets the global variable errno to indicate the error. In this case, the
contents of the structures pointed to by the ReadList, WriteList, and
ExceptList parameters are unpredictable.

Error Codes

The select subroutine is unsuccessful if one of the following is true:

Item
    Description

EBADF
    An invalid file descriptor or message queue identifier was
    specified.

EAGAIN
    Allocation of internal data structures was unsuccessful.

EINTR
    A signal was caught during the select subroutine and the signal
    handler was installed with an indication that subroutines are not to
    be restarted.

EINVAL
    An invalid value was specified for the TimeOut parameter or the
    Nfdsmsgs parameter.

EINVAL
    The STREAM or multiplexer referenced by one of the file descriptors
    is linked (directly or indirectly) downstream from a multiplexer.

EFAULT
    The ReadList, WriteList, ExceptList, or TimeOut parameter points to
    a location outside of the address space of the process.

Examples

The following is an example of the behavior of the select subroutine
called on a non-blocking socket, when trying to connect to a host that
is unreachable:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/select.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <arpa/inet.h>      /* inet_addr */
    #include <fcntl.h>
    #include <sys/time.h>
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>         /* exit */
    #include <strings.h>        /* bzero */

    int main(void)
    {
        int sockfd, cnt, i = 1;
        struct sockaddr_in serv_addr;

        /* Address of the unreachable host used for the demonstration. */
        bzero((char *)&serv_addr, sizeof(serv_addr));
        serv_addr.sin_family = AF_INET;
        serv_addr.sin_addr.s_addr = inet_addr("172.16.55.25");
        serv_addr.sin_port = htons(102);

        if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            exit(1);
        /* FNONBLOCK as in the AIX original; O_NONBLOCK is the POSIX name. */
        if (fcntl(sockfd, F_SETFL, FNONBLOCK) < 0)
            exit(1);
        /* A non-blocking connect normally fails with EINPROGRESS. */
        if (connect(sockfd, (struct sockaddr *)&serv_addr,
                    sizeof(serv_addr)) < 0 && errno != EINPROGRESS)
            exit(1);

        for (cnt = 0; cnt < 2; cnt++) {
            fd_set readfds, writefds;

            FD_ZERO(&readfds);
            FD_SET(sockfd, &readfds);
            FD_ZERO(&writefds);
            FD_SET(sockfd, &writefds);
            /* Block until the socket is readable or writable. */
            if (select(sockfd + 1, &readfds, &writefds, NULL, NULL) < 0)
                exit(1);
            printf("Iteration %d ==============\n", i);
            printf("FD_ISSET(sockfd, &readfds) == %d\n",
                   FD_ISSET(sockfd, &readfds));
            printf("FD_ISSET(sockfd, &writefds) == %d\n",
                   FD_ISSET(sockfd, &writefds));
            i++;
        }
        return 0;
    }

Here is the output of the above program:

    Iteration 1 ==============
    FD_ISSET(sockfd, &readfds) == 0
    FD_ISSET(sockfd, &writefds) == 1
    Iteration 2 ==============
    FD_ISSET(sockfd, &readfds) == 1
    FD_ISSET(sockfd, &writefds) == 1

In the first iteration, select notifies the write event only. In the
second iteration, select notifies both the read and write events.

Notes

FD_SETSIZE is the #define variable that defines how many file
descriptors the various FD macros will use. The default value for
FD_SETSIZE is 65534 open file descriptors. This value cannot be set
greater than OPEN_MAX.

For more information, refer to the /usr/include/sys/time.h file.

The user may override FD_SETSIZE to select a smaller value before
including the system header files. This is desirable for performance
reasons, because of the overhead in FD_ZERO to zero 65534 bits.

Performance Issues and Recommended Coding Practices

The select subroutine can be a very compute-intensive system call,
depending on the number of open file descriptors used and the lengths of
the bit maps used. Do not follow the examples shown in many textbooks.
Most were written when the number of open files supported was small, and
thus the bit maps were short. You should avoid the following (where
select is being passed FD_SETSIZE as the number of FDs to process):

    select(FD_SETSIZE, ....)

Performance will be poor if the program uses FD_ZERO and the default
FD_SETSIZE. FD_ZERO should not be used in any loops or before each
select call. However, using it one time to zero the bit string will not
cause problems. If you plan to use this simple programming method, you
should override FD_SETSIZE to define a smaller number of FDs. For
example, if your process will only open two FDs that you will be
selecting on, and there will never be more than a few hundred other FDs
open in the process, you should lower FD_SETSIZE to approximately 1024.

Do not pass FD_SETSIZE as the first parameter to select. This parameter
specifies the maximum number of file descriptors the system should
check. The program should keep track of the highest FD that has been
assigned or use the getdtablesize subroutine to determine this value.
This saves passing excessively long bit maps in and out of the kernel
and reduces the number of FDs that select must check.
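
The advice about the first parameter can be sketched as follows, using
the BSD-style fd_set interface from the Examples section.
wait_readable2 is a hypothetical helper name; it calls FD_ZERO on each
invocation for brevity, which assumes FD_SETSIZE has been lowered as
recommended above.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for either of
 * two descriptors to become readable. Returns select()'s result and
 * leaves the ready descriptors set in *ready. */
int wait_readable2(int fd_a, int fd_b, fd_set *ready, long timeout_ms)
{
    struct timeval tv;
    int maxfd = fd_a > fd_b ? fd_a : fd_b;  /* highest FD in use */

    FD_ZERO(ready);
    FD_SET(fd_a, ready);
    FD_SET(fd_b, ready);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Pass highest descriptor + 1 rather than FD_SETSIZE, so the kernel
     * only examines the part of the bit map that can be set. */
    return select(maxfd + 1, ready, NULL, NULL, &tv);
}
```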

Use the poll system call instead of select. The poll system call has the
same functionality as select, but it uses a list of FDs instead of a bit
map. Thus, if you are only selecting on a single FD, you would only pass
one FD to poll. With select, you have to pass a bit map that is as long
as the FD number assigned for that FD. If AIX assigned FD 4000, for
example, you would have to pass a bit map 4001 bits long.
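
A minimal sketch of the single-FD case with poll, assuming the standard
POSIX poll interface. wait_readable_poll is a hypothetical helper name
used only for this illustration.

```c
#include <poll.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for fd.
 * Returns 1 if a read would not block (data, hang-up, or error is
 * pending), 0 on timeout, and -1 on failure or an invalid descriptor. */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* One array entry regardless of how large the FD number is. */
    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;                  /* 0: timeout; -1: error */
    if (pfd.revents & POLLNVAL)
        return -1;                  /* fd is not an open descriptor */
    return 1;
}
```

The cost of the call depends on the number of entries passed, not on the
numeric value of the descriptor.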