Thanks Dave,
Filed the bug report http://www.paraview.org/Bug/view.php?id=12087
I updated the patch for 3.10.0 as well (attached here and on the bug
report).
Burlen
On 04/13/2011 11:24 AM, David Partyka wrote:
Hmm, I forgot all about this email. I'll stick it in right now for
3.10.2. If you don't mind, please file a bug so that it isn't forgotten.
On Wed, Apr 13, 2011 at 2:17 PM, Burlen Loring <blor...@lbl.gov> wrote:
Hi Dave,
What is the status on this?
Burlen
On 02/27/2011 02:53 PM, David Partyka wrote:
Thanks Burlen, We'll take a look.
On Sun, Feb 27, 2011 at 5:18 PM, Burlen Loring <blor...@lbl.gov> wrote:
Hi,
While installing ParaView on Nautilus,
http://www.nics.tennessee.edu/computing-resources/nautilus, I
hit a bug in vtkSocket that prevents ParaView from running on
this machine. While tracking this down I uncovered a couple
of related issues.
The main issue is that vtkSocket does not handle EINTR. EINTR
occurs when the application catches a signal during a
blocking socket call. ParaView itself does not use signals,
but some SGI-specific libraries on Nautilus that are linked in
with SGI MPI use them for asynchronous communication. Because
the rank 0 pvserver spends quite a bit of its time blocked in
socket calls, it only takes a few tens of seconds for EINTR to
occur. When faced with EINTR, ParaView silently exits, leaving
the user wondering what the heck happened. Which brings me to
the second issue, a lack of error reporting in vtkSocket.
To solve the first issue, vtkSocket has to handle EINTR. How
EINTR should be handled depends on the specific socket call.
For all calls except connect, the call can simply be
restarted. An interrupted connect can't be restarted on all
Unix systems, so instead one must block in a select call
when connect fails with EINTR. To be portable across Unix, one
should handle EINTR in all socket calls, even simple ones
like set/getsockopt.
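
A minimal sketch of the two strategies described above, assuming POSIX
sockets and C++11. The names RetryOnEintr and FinishInterruptedConnect are
hypothetical and are not part of vtkSocket or the attached patch. The second
helper waits for the socket to become writable and then reads SO_ERROR, which
is one conventional way to detect that a connection attempt has finished.

```cpp
#include <cerrno>
#include <sys/select.h>
#include <sys/socket.h>

// Hypothetical helper (not part of vtkSocket): repeat a call for as long
// as it fails with -1 and errno == EINTR, then return its final result.
template <typename Call>
int RetryOnEintr(Call call)
{
  int ret;
  do
  {
    ret = call();
  } while (ret == -1 && errno == EINTR);
  return ret;
}

// Hypothetical helper: after connect() failed with EINTR, wait for the
// connection attempt to finish instead of calling connect() again.
// Returns 0 on success, or -1 with errno set to the reason for failure.
int FinishInterruptedConnect(int fd)
{
  // The descriptor set is rebuilt on every attempt because its contents
  // are unspecified after select() fails.
  int ready = RetryOnEintr([fd]() {
    fd_set wset;
    FD_ZERO(&wset);
    FD_SET(fd, &wset);
    // No timeout: block until the socket is writable.
    return select(fd + 1, nullptr, &wset, nullptr, nullptr);
  });
  if (ready == -1)
  {
    return -1;
  }

  // Writable does not imply connected: fetch the pending error, if any.
  int pending = 0;
  socklen_t len = sizeof(pending);
  if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &pending, &len) == -1)
  {
    return -1;
  }
  if (pending != 0)
  {
    errno = pending;
    return -1;
  }
  return 0;
}
```

A restartable call would be wrapped as
RetryOnEintr([&]() { return listen(fd, 1); }), while connect would be called
once and followed by FinishInterruptedConnect(fd) only when it fails with
EINTR.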
The second issue, error reporting, applies to all socket-related
errors in general. My feeling is that when a socket
call fails, vtkSocket should print a message using
vtkErrorMacro, errno, and strerror (or the Windows equivalent) at
the point of failure. I think this should be done inside
vtkSocket for two reasons. First, this is the only place one can
safely assume errno has relevant information. Second, vtkSocket
has been implemented to return a single error code, -1, so
returning the real error code would change the API and break
existing code, including ParaView. Not to mention that the
values for error codes are apparently different on Windows
and Unix.
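
To illustrate reporting at the point of failure, here is a small sketch
assuming a POSIX system. DescribeSocketError and CloseAndReport are
hypothetical names, and the std::string report stands in for the message that
would go to vtkErrorMacro. The wrapper keeps the single -1 error return and
captures errno immediately after the failing call.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <unistd.h>

// Hypothetical helper: build a message from an error number that was
// captured at the point of failure.
std::string DescribeSocketError(const char* call, int err)
{
  const char* text = std::strerror(err);
  return std::string("Socket error in call to ") + call + ": "
    + (text ? text : "unknown error") + ".";
}

// Hypothetical wrapper: keeps the single -1 error return, but records
// why the call failed before returning.
int CloseAndReport(int fd, std::string& report)
{
  if (close(fd) == -1)
  {
    int err = errno; // capture now; later calls may overwrite errno
    report = DescribeSocketError("close", err);
    return -1;
  }
  report.clear();
  return 0;
}
```

Callers still only see -1, so existing code keeps working, while the cause of
the failure is recorded where errno is known to be valid.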
I took a stab at fixing these issues, patches attached. I
tested them on my workstation, Nautilus, and a laptop running
XP. I ran a dashboard on my Linux workstation and didn't see
any related issues. Would someone at KW mind taking a look at
the changes and seeing if they could be made permanent?
By the way, after testing all socket calls for error returns I
uncovered a third bug: vtkSocket::Close didn't set the
descriptor ivar to -1, which resulted in vtkSocket::~vtkSocket
calling close on a closed socket. Not a disastrous error,
but this reinforces my opinion that the returns should be
tested and error messages printed.
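
The double close can be shown with a small sketch, assuming a POSIX system.
DescriptorOwner is a hypothetical stand-in and is not vtkSocket. Resetting
the descriptor to -1 in Close makes the later call from the destructor a
no-op.

```cpp
#include <unistd.h>

// Hypothetical stand-in for a class that owns a descriptor.
class DescriptorOwner
{
public:
  explicit DescriptorOwner(int fd) : Descriptor(fd) {}
  ~DescriptorOwner() { this->Close(); }

  DescriptorOwner(const DescriptorOwner&) = delete;
  DescriptorOwner& operator=(const DescriptorOwner&) = delete;

  // Close the descriptor at most once. Returns the result of close(),
  // or 0 if the descriptor was already closed.
  int Close()
  {
    if (this->Descriptor < 0)
    {
      return 0;
    }
    int result = close(this->Descriptor);
    this->Descriptor = -1; // mark as closed so later calls are no-ops
    return result;
  }

  int GetDescriptor() const { return this->Descriptor; }

private:
  int Descriptor;
};
```

Without the reset, the destructor would call close a second time on a
descriptor number that is no longer owned, and that call would fail or, worse,
close a descriptor that has since been reused.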
Thanks
Burlen
_______________________________________________
Powered by www.kitware.com <http://www.kitware.com>
Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html
Please keep messages on-topic and check the ParaView Wiki at:
http://paraview.org/Wiki/ParaView
Follow this link to subscribe/unsubscribe:
http://www.paraview.org/mailman/listinfo/paraview
--- VTK/Common/vtkSocket.h 2011-04-14 10:50:32.655498357 -0700
+++ VTK/Common/vtkSocket.h 2011-04-14 10:51:53.335704415 -0700
@@ -36,7 +36,7 @@
// Description:
// Close the socket.
- void CloseSocket() {this->CloseSocket(this->SocketDescriptor);}
+ void CloseSocket();
// ------ Communication API ---
// Description:
--- VTK/Common/vtkSocket.cxx 2011-02-27 13:47:42.787573794 -0800
+++ VTK/Common/vtkSocket.cxx 2011-02-27 14:00:30.506483646 -0800
@@ -16,6 +16,13 @@
#include "vtkObjectFactory.h"
+#include <algorithm>
+using std::max;
+
+#if defined(__BORLANDC__)
+# pragma warn -8012 /* signed/unsigned comparison */
+#endif
+
// The VTK_SOCKET_FAKE_API definition is given to the compiler
// command line by CMakeLists.txt if there is no real sockets
// interface available. When this macro is defined we simply make
@@ -38,25 +45,99 @@
#include <unistd.h>
#include <sys/time.h>
#include <errno.h>
+ #include <string.h>
+ #include <stdio.h>
#endif
#endif
#if defined(_WIN32) && !defined(__CYGWIN__)
+
+// TODO : document why we restrict to v1.1
#define WSA_VERSION MAKEWORD(1,1)
-#define vtkCloseSocketMacro(sock) (closesocket(sock))
-#else
-#define vtkCloseSocketMacro(sock) (close(sock))
-#endif
-#if defined(__BORLANDC__)
-# pragma warn -8012 /* signed/unsigned comparison */
+#define vtkCloseSocketMacro(sock) (closesocket(sock))
+#define vtkErrnoMacro (WSAGetLastError())
+#define vtkStrerrorMacro(_num) (wsaStrerror(_num))
+#define vtkSocketErrorIdMacro(_id) (WSA##_id)
+#define vtkSocketErrorReturnMacro (SOCKET_ERROR)
+
+#else
+
+#define vtkCloseSocketMacro(sock) (close(sock))
+#define vtkErrnoMacro (errno)
+#define vtkStrerrorMacro(_num) (strerror(_num))
+#define vtkSocketErrorIdMacro(_id) (_id)
+#define vtkSocketErrorReturnMacro (-1)
+
+#endif
+
+// This macro wraps a system function call(_call),
+// restarting the call in case it was interrupted
+// by a signal (EINTR).
+#define vtkRestartInterruptedSystemCallMacro(_call,_ret)\
+ do \
+ { \
+ (_ret)=(_call); \
+ } \
+ while (((_ret)==vtkSocketErrorReturnMacro) \
+ && (vtkErrnoMacro==vtkSocketErrorIdMacro(EINTR)));
+
+// use when _str may be a null pointer but _fallback is not.
+#define vtkSafeStrMacro(_str,_fallback) ((_str)?(_str):(_fallback))
+
+// convert error number to string and report via vtkErrorMacro.
+#define vtkSocketErrorMacro(_eno, _message) \
+ vtkErrorMacro( \
+ << (_message) \
+ << " " \
+ << vtkSafeStrMacro( \
+ vtkStrerrorMacro(_eno), \
+ "unknown error") \
+ << ".");
+
+// this pointer is not accessible in a static member function
+#define vtkGenericErrorMacro(x) \
+{ if (vtkObject::GetGlobalWarningDisplay()) { \
+ vtkOStreamWrapper::EndlType endl; \
+ vtkOStreamWrapper::UseEndl(endl); \
+ vtkOStrStreamWrapper vtkmsg; \
+ vtkmsg \
+ << "Error: In " __FILE__ ", line " \
+ << __LINE__ << "\n" x \
+ << "\n\n"; \
+ vtkOutputWindowDisplayErrorText(vtkmsg.str()); \
+ vtkmsg.rdbuf()->freeze(0);}}
+
+// convert error number to string and report via vtkGenericErrorMacro
+#define vtkSocketGenericErrorMacro(_message) \
+ vtkGenericErrorMacro( \
+ << (_message) \
+ << " " \
+ << vtkSafeStrMacro( \
+ vtkStrerrorMacro(vtkErrnoMacro), \
+ "unknown error") \
+ << ".");
+
+// on windows strerror doesn't handle socket error codes
+#if defined(_WIN32) && !defined(__CYGWIN__)
+static
+const char *wsaStrerror(int wsaeid)
+{
+ static char buf[256]={'\0'};
+ int ok;
+ ok=FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM,0,wsaeid,0,buf,256,0);
+ if (!ok)
+ {
+ return 0;
+ }
+ return buf;
+}
#endif
//-----------------------------------------------------------------------------
vtkSocket::vtkSocket()
{
this->SocketDescriptor = -1;
-
}
//-----------------------------------------------------------------------------
@@ -73,13 +154,26 @@
int vtkSocket::CreateSocket()
{
#ifndef VTK_SOCKET_FAKE_API
- int sock = socket(AF_INET, SOCK_STREAM, 0);
+ int sock;
+ vtkRestartInterruptedSystemCallMacro(socket(AF_INET,SOCK_STREAM, 0), sock);
+ if (sock == vtkSocketErrorReturnMacro)
+ {
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to socket.");
+ return -1;
+ }
+
// Elimate windows 0.2 second delay sending (buffering) data.
int on = 1;
- if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, (char*)&on, sizeof(on)))
+ int iErr;
+ vtkRestartInterruptedSystemCallMacro(
+ setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, (char*)&on, sizeof(on)),
+ iErr);
+ if (iErr == vtkSocketErrorReturnMacro)
{
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to setsockopt.");
return -1;
}
+
return sock;
#else
return -1;
@@ -87,6 +181,13 @@
}
//-----------------------------------------------------------------------------
+void vtkSocket::CloseSocket()
+{
+ this->CloseSocket(this->SocketDescriptor);
+ this->SocketDescriptor = -1;
+}
+
+//-----------------------------------------------------------------------------
int vtkSocket::BindSocket(int socketdescriptor, int port)
{
#ifndef VTK_SOCKET_FAKE_API
@@ -97,16 +198,31 @@
server.sin_port = htons(port);
// Allow the socket to be bound to an address that is already in use
int opt=1;
+ int iErr=~vtkSocketErrorReturnMacro;
#ifdef _WIN32
- setsockopt(socketdescriptor, SOL_SOCKET, SO_REUSEADDR, (char*) &opt, sizeof(int));
+ vtkRestartInterruptedSystemCallMacro(
+ setsockopt(socketdescriptor,SOL_SOCKET,SO_REUSEADDR,(char*)&opt,sizeof(int)),
+ iErr);
#elif defined(VTK_HAVE_SO_REUSEADDR)
- setsockopt(socketdescriptor, SOL_SOCKET, SO_REUSEADDR, (void *) &opt, sizeof(int));
+ vtkRestartInterruptedSystemCallMacro(
+ setsockopt(socketdescriptor,SOL_SOCKET,SO_REUSEADDR,(void*)&opt,sizeof(int)),
+ iErr);
#endif
+ if (iErr == vtkSocketErrorReturnMacro)
+ {
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to setsockopt.");
+ return -1;
+ }
- if ( bind(socketdescriptor, reinterpret_cast<sockaddr*>(&server), sizeof(server)) )
+ vtkRestartInterruptedSystemCallMacro(
+ bind(socketdescriptor,reinterpret_cast<sockaddr*>(&server),sizeof(server)),
+ iErr);
+ if (iErr == vtkSocketErrorReturnMacro)
{
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to bind.");
return -1;
}
+
return 0;
#else
static_cast<void>(socketdescriptor);
@@ -121,9 +237,20 @@
#ifndef VTK_SOCKET_FAKE_API
if (socketdescriptor < 0)
{
+ vtkErrorMacro("Invalid descriptor.");
return -1;
}
- return accept(socketdescriptor, 0, 0);
+
+ int newDescriptor;
+ vtkRestartInterruptedSystemCallMacro(
+ accept(socketdescriptor, 0, 0), newDescriptor);
+ if (newDescriptor == vtkSocketErrorReturnMacro)
+ {
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to accept.");
+ return -1;
+ }
+
+ return newDescriptor;
#else
static_cast<void>(socketdescriptor);
return -1;
@@ -136,9 +263,19 @@
#ifndef VTK_SOCKET_FAKE_API
if (socketdescriptor < 0)
{
+ vtkErrorMacro("Invalid descriptor.");
+ return -1;
+ }
+
+ int iErr;
+ vtkRestartInterruptedSystemCallMacro(listen(socketdescriptor, 1), iErr);
+ if (iErr == vtkSocketErrorReturnMacro)
+ {
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to listen.");
return -1;
}
- return listen(socketdescriptor, 1);
+
+ return 0;
#else
static_cast<void>(socketdescriptor);
return -1;
@@ -151,32 +288,54 @@
#ifndef VTK_SOCKET_FAKE_API
if (socketdescriptor < 0 )
{
- // invalid socket descriptor.
+ vtkErrorMacro("Invalid descriptor.");
return -1;
}
-
+
fd_set rset;
- struct timeval tval;
- struct timeval* tvalptr = 0;
- if ( msec > 0 )
+ int res;
+ do
{
- tval.tv_sec = msec / 1000;
- tval.tv_usec = (msec % 1000)*1000;
- tvalptr = &tval;
+ struct timeval tval;
+ struct timeval* tvalptr = 0;
+ if (msec>0)
+ {
+ tval.tv_sec = msec / 1000;
+ tval.tv_usec = (msec % 1000)*1000;
+ tvalptr = &tval;
+ }
+
+ FD_ZERO(&rset);
+ FD_SET(socketdescriptor, &rset);
+
+ // block until socket is readable.
+ res = select(socketdescriptor+1, &rset, 0, 0, tvalptr);
}
- FD_ZERO(&rset);
- FD_SET(socketdescriptor, &rset);
- int res = select(socketdescriptor + 1, &rset, 0, 0, tvalptr);
- if(res == 0)
+ while ((res == vtkSocketErrorReturnMacro)
+ && (vtkErrnoMacro == vtkSocketErrorIdMacro(EINTR)));
+
+ if (res == 0)
{
- return 0;//for time limit expire
+ // time out
+ return 0;
}
-
- if ( res < 0 || !(FD_ISSET(socketdescriptor, &rset)) )
+ else
+ if (res == vtkSocketErrorReturnMacro)
{
- // Some error.
+ // error in the call
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to select.");
return -1;
}
+ else
+ if (!FD_ISSET(socketdescriptor, &rset))
+ {
+ vtkErrorMacro("Socket error in select. Descriptor not selected.");
+ return -1;
+ }
+
+ // NOTE: not checking for pending errors,these will be handled
+ // in the next call to read/recv
+
// The indicated socket has some activity on it.
return 1;
#else
@@ -191,50 +350,70 @@
unsigned long msec, int* selected_index)
{
#ifndef VTK_SOCKET_FAKE_API
- int i;
- int max_fd = -1;
+
*selected_index = -1;
- if (size < 0)
+
+ if (size < 0)
{
+ vtkGenericErrorMacro("Can't select fewer than 0.");
return -1;
}
-
+
fd_set rset;
- struct timeval tval;
- struct timeval* tvalptr = 0;
- if ( msec > 0 )
- {
- tval.tv_sec = msec / 1000;
- tval.tv_usec = msec % 1000;
- tvalptr = &tval;
- }
- FD_ZERO(&rset);
- for (i=0; i<size; i++)
+ int res = -1;
+ do
{
- FD_SET(sockets_to_select[i],&rset);
- max_fd = (sockets_to_select[i] > max_fd)? sockets_to_select[i] : max_fd;
+ struct timeval tval;
+ struct timeval* tvalptr = 0;
+ if (msec>0)
+ {
+ tval.tv_sec = msec / 1000;
+ tval.tv_usec = (msec % 1000)*1000;
+ tvalptr = &tval;
+ }
+
+ FD_ZERO(&rset);
+ int max_fd = -1;
+ for (int i=0; i<size; i++)
+ {
+ FD_SET(sockets_to_select[i],&rset);
+ max_fd = max(sockets_to_select[i],max_fd);
+ }
+
+ // block until one socket is ready to read.
+ res = select(max_fd + 1, &rset, 0, 0, tvalptr);
}
-
- int res = select(max_fd + 1, &rset, 0, 0, tvalptr);
- if (res == 0)
+ while ((res == vtkSocketErrorReturnMacro)
+ && (vtkErrnoMacro == vtkSocketErrorIdMacro(EINTR)));
+
+ if (res==0)
{
- return 0; //Timeout
+ // time out
+ return 0;
}
- if (res < 0)
+ else
+ if (res == vtkSocketErrorReturnMacro)
{
- // SelectSocket error.
+ // error in the call
+ vtkSocketGenericErrorMacro("Socket error in call to select.");
return -1;
}
-
- //check which socket has some activity.
- for (i=0; i<size; i++)
+
+ // find the first socket which has some activity.
+ for (int i=0; i<size; i++)
{
if ( FD_ISSET(sockets_to_select[i],&rset) )
{
+ // NOTE: not checking for pending errors, these
+ // will be handled in the next call to read/recv
+
*selected_index = i;
return 1;
}
}
+
+ // no activity on any of the sockets
+ vtkGenericErrorMacro("Socket error in select. No descriptor selected.");
return -1;
#else
static_cast<void>(sockets_to_select);
@@ -251,6 +430,7 @@
#ifndef VTK_SOCKET_FAKE_API
if (socketdescriptor < 0)
{
+ vtkErrorMacro("Invalid descriptor.");
return -1;
}
@@ -261,10 +441,9 @@
unsigned long addr = inet_addr(hostName);
hp = gethostbyaddr((char *)&addr, sizeof(addr), AF_INET);
}
-
if (!hp)
{
- // vtkErrorMacro("Unknown host: " << hostName);
+ vtkErrorMacro("Unknown host: " << hostName);
return -1;
}
@@ -273,8 +452,48 @@
memcpy(&name.sin_addr, hp->h_addr, hp->h_length);
name.sin_port = htons(port);
- return connect(socketdescriptor, reinterpret_cast<sockaddr*>(&name),
- sizeof(name));
+ int iErr
+ = connect(socketdescriptor, reinterpret_cast<sockaddr*>(&name),sizeof(name));
+ if ( (iErr == vtkSocketErrorReturnMacro )
+ && (vtkErrnoMacro == vtkSocketErrorIdMacro(EINTR)) )
+ {
+ // Restarting an interrupted connect call only works on linux,
+ // other unix require a call to select which blocks until the
+ // connection is complete.
+ // See Stevens 2d ed, 15.4 p413, "interrupted connect"
+ iErr = this->SelectSocket(socketdescriptor,0);
+ if (iErr == -1)
+ {
+ // SelectSocket doesn't test for pending errors.
+ int pendingErr;
+ socklen_t pendingErrLen=sizeof(pendingErr);
+ vtkRestartInterruptedSystemCallMacro(
+ getsockopt(socketdescriptor, SOL_SOCKET, SO_ERROR, &pendingErr, &pendingErrLen),
+ iErr);
+ if (iErr == vtkSocketErrorReturnMacro)
+ {
+ vtkSocketErrorMacro(
+ vtkErrnoMacro, "Socket error in call to getsockopt.");
+ return -1;
+ }
+ else
+ if (pendingErr)
+ {
+ vtkSocketErrorMacro(
+ pendingErr, "Socket error pending from call to connect.");
+ return -1;
+ }
+ }
+ }
+ else
+ if (iErr == vtkSocketErrorReturnMacro)
+ {
+ vtkSocketErrorMacro(
+ vtkErrnoMacro, "Socket error in call to connect.");
+ return -1;
+ }
+
+ return 0;
#else
static_cast<void>(socketdescriptor);
static_cast<void>(hostName);
@@ -294,8 +513,14 @@
#else
int sizebuf = sizeof(sockinfo);
#endif
- if(getsockname(sock, reinterpret_cast<sockaddr*>(&sockinfo), &sizebuf) != 0)
+ int iErr;
+ vtkRestartInterruptedSystemCallMacro(
+ getsockname(sock, reinterpret_cast<sockaddr*>(&sockinfo), &sizebuf),
+ iErr);
+ if (iErr == vtkSocketErrorReturnMacro)
{
+ vtkSocketErrorMacro(
+ vtkErrnoMacro, "Socket error in call to getsockname.");
return 0;
}
return ntohs(sockinfo.sin_port);
@@ -311,9 +536,18 @@
#ifndef VTK_SOCKET_FAKE_API
if (socketdescriptor < 0)
{
+ vtkErrorMacro("Invalid descriptor.");
return;
}
- vtkCloseSocketMacro(socketdescriptor);
+ int iErr;
+ vtkRestartInterruptedSystemCallMacro(
+ vtkCloseSocketMacro(socketdescriptor),
+ iErr);
+ if (iErr == vtkSocketErrorReturnMacro)
+ {
+ vtkSocketErrorMacro(
+ vtkErrnoMacro, "Socket error in call to close/closesocket.");
+ }
#else
static_cast<void>(socketdescriptor);
return;
@@ -326,6 +560,7 @@
#ifndef VTK_SOCKET_FAKE_API
if (!this->GetConnected())
{
+ vtkErrorMacro("Not connected.");
return 0;
}
if (length == 0)
@@ -337,22 +572,19 @@
int total = 0;
do
{
- int flags;
-#if defined(_WIN32) && !defined(__CYGWIN__)
- flags = 0;
-#else
- // disabling, since not present on SUN.
- // flags = MSG_NOSIGNAL; //disable signal on Unix boxes.
- flags = 0;
-#endif
- int n = send(this->SocketDescriptor, buffer+total, length-total, flags);
- if(n < 0)
+ int flags=0;
+ int nSent;
+ vtkRestartInterruptedSystemCallMacro(
+ send(this->SocketDescriptor, buffer+total, length-total, flags),
+ nSent);
+ if (nSent == vtkSocketErrorReturnMacro)
{
- vtkErrorMacro("Socket Error: Send failed.");
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to send.");
return 0;
}
- total += n;
+ total += nSent;
} while(total < length);
+
return 1;
#else
static_cast<void>(data);
@@ -367,39 +599,49 @@
#ifndef VTK_SOCKET_FAKE_API
if (!this->GetConnected())
{
+ vtkErrorMacro("Not connected.");
return 0;
}
+#if defined(_WIN32) && !defined(__CYGWIN__)
+ int trys = 0;
+#endif
+
char* buffer = reinterpret_cast<char*>(data);
int total = 0;
do
{
-#if defined(_WIN32) && !defined(__CYGWIN__)
- int trys = 0;
-#endif
- int n = recv(this->SocketDescriptor, buffer+total, length-total, 0);
- if(n < 1)
+ int nRecvd;
+ vtkRestartInterruptedSystemCallMacro(
+ recv(this->SocketDescriptor, buffer+total, length-total, 0),
+ nRecvd);
+
+ if (nRecvd == 0)
{
+ // peer shut down
+ return 0;
+ }
+
#if defined(_WIN32) && !defined(__CYGWIN__)
+ if ((nRecvd == vtkSocketErrorReturnMacro)
+ && (WSAGetLastError() == WSAENOBUFS))
+ {
// On long messages, Windows recv sometimes fails with WSAENOBUFS, but
// will work if you try again.
- int error = WSAGetLastError();
- if ((error == WSAENOBUFS) && (trys++ < 1000))
+ if ((trys++ < 1000))
{
Sleep(1);
continue;
}
-#else
- // On unix, a recv may be interrupted by a signal. In this case we should
- // retry.
- int errorNumber = errno;
- if (errorNumber == EINTR) continue;
-#endif
- vtkErrorMacro("Socket Error: Receive failed.");
+ vtkSocketErrorMacro(vtkErrnoMacro, "Socket error in call to recv.");
return 0;
}
- total += n;
- } while(readFully && total < length);
+#endif
+
+ total += nRecvd;
+ }
+ while( readFully && (total < length));
+
return total;
#else
static_cast<void>(data);