subject:"\[HACKERS\] Escaping from blocked send reprised."

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-13 Thread Andres Freund

On 2015-01-12 00:40:50 +0100, Andres Freund wrote:
 Fixed in what I've since pushed (as Heikki basically was ok with the
 patch sent a couple months back, modulo some fixes)...

I'd not actually pushed that patch... I had pushed some patches
(barriers, atomics), but had decided to hold off on this. I've now done
so.

I've mentioned the portability concerns over select() bugs in the commit
message  a comment. ATM I'm not inclined to add a relatively elaborate
test for the bug on pretty fringe platforms.

Thanks for looking at this!

I plan to continue with committing
1) Commonalize process startup code
2) Add a default local latch for use in signal handlers
3) Use a nonblocking socket for FE/BE communication and block using latches

pretty soon.

As we already seem to assume that WaitLatch() is signal safe/reentrant
(c.f. walsender.c), I'm fine with committing 3) in isolation, without
4). I need a test that properly exercises catchup interrupts before
committing that.

Once I have that test I plan to commit
4) Introduce and use infrastructure for interrupt processing during
client reads.

I'd like some input from others what they think about the problem that
5) Process 'die' interrupts while reading/writing from a socket.

can reduce the likelihood of clients getting the error message. I
personally think that's more than outweighed by not having backends
stuck (save quickdie) for a long time when the client is gone/stuck. I
think the middleground in the patch to only process die events when
actually blocked in writes reduces the likelihood of this sufficiently.

I have hacks ontop this to get rid of ImmediateInterrupt alltogether,
although I'm not sure how well this will work for some parts of
auth/crypt.c. Everything else, including the deadlock checker, seems
quite doable.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-11 Thread Noah Misch

On Mon, Jan 12, 2015 at 01:45:41AM +0100, Andres Freund wrote:
 On 2015-01-11 19:37:53 -0500, Noah Misch wrote:
  I recommend either (a) taking no action or (b) adding a regression test
  verifying WaitLatchOrSocket() conformance in this scenario.
 
 Do you have a good idea how to test b) save a C function in regress.c
 that does what your test does using latches?

No, that's what I had in mind.  You could probably achieve it with a libpq
program that lets input accumulate, but that's trickier.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-11 Thread Andres Freund

On 2015-01-11 19:37:53 -0500, Noah Misch wrote:
 I recommend either (a) taking no action or (b) adding a regression test
 verifying WaitLatchOrSocket() conformance in this scenario.

Do you have a good idea how to test b) save a C function in regress.c
that does what your test does using latches?

 Then we cane decide what more to do if failure evidence emerges.

Seems fine to me.

  Hm. I can think of two stopgap measures we could add:
  
  1) If we're using select() and WL_SOCKET_WRITEABLE is set without
 _READABLE, add a timeout of Min(1s, Max(passed_timeout, 1s)). As the
 time spent waiting only for writable normally shouldn't be very long,
 that shouldn't be noticeably bad for power usage.
  2) Add a SIGPIPE handler that just does a SetLatch(MyLach).
 
 I'm having trouble visualizing those proposed measures in detail, but I trust
 that a decent workaround would emerge.

For 1) I'm thinking of just regularly causing a spurious
WL_SOCKET_WRITEABLE event via timeouts if it's the only parameter. Latch
users have to deal with spurious wakeups anyway, so that should be
mostly unproblematic.

For 2) I was thinking that for now the problem only arises for the main
FE/BE socket. So we can install a sigpipe handler that does a SetLatch()
- that will trigger WaitLatch() to return and, after checking for
interrupts, retry the actual send() - which then'd return ECONNRESET.

  What happens if you write/send() in
  that state, btw?
 
 write() reports EAGAIN.

Grand.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-11 Thread Robert Haas

On Sat, Jan 10, 2015 at 11:35 AM, Andres Freund and...@2ndquadrant.com wrote:
 Interesting. I dimly remembered you mentioning this, that's how I
 rediscovered this message.

 Do you remember any details?

No, not really.

 My guess that's not so much the overhead of the latch itself, but the
 lack of the directed wakeup stuff the OS provides for semaphores.

That's possible.

 If we could replace all usages of semaphores that set immediate
 interrupts to ok, we could quite easily make the deadlock detector
 et. al. run outside of signal handlers. That would imo make it more
 robust, and easier to understand - right now the correctness of locking
 done in the deadlock detector isn't obvious.  With the infrastructure in
 place it'd also allow your new parallelism code to run outside of signal
 handlers.

Yes, I would be very happy to see ImmediateInterruptOK die in a fire.

 Unfortunately currently sempahores can't be unlocked in a signal handler
 (as sysv semaphores aren't signal safe)... It'd also be not so nice to
 set both a latch and semaphores in every signal handler.

Agreed.

 AIUI, the only reason why we need the self-pipe thing is because on
 some platforms signals don't interrupt system calls.  But my
 impression was that those platforms were somewhat obscure.

 To the contrary, I think it's only very obscure platforms where signals
 still interrupt syscalls - we set SA_RESTART for pretty much
 everything. There's a couple of system calls that ignore SA_RESTART. For
 some that's defined in posix, for others it's operating system
 specific. E.g. on linux semop(), poll(), select() are defined to always
 return EINTR when interrupted.

The recent problems with test_shm_mq test failing on anole were caused
by the fact that a signal doesn't abort select() on that platform, but
does reset the timer.  So a steady stream of signals results in never
reaching the timeout.

 Anyway, the discussion since cleared up that we need the self byte to
 handle a race, anyway.

Eh?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-11 Thread Noah Misch

On Mon, Jan 12, 2015 at 12:40:50AM +0100, Andres Freund wrote:
 On 2015-01-11 16:36:07 -0500, Noah Misch wrote:
  On Sat, Jan 10, 2015 at 03:25:42AM +0100, Andres Freund wrote:
   0001-Allow-latches-to-wait-for-socket-writability-without.patch
Imo pretty close to commit and can be committed independently.
  
  The key open question is whether all platforms of interest can reliably 
  detect
  end-of-file when poll()ing or select()ing for write only.  Older GNU/Linux
  select() cannot; see attached test program.
 
 Yuck. By my reading that's a violation of posix.

Agreed.

 I did test it a bit, and I didn't see problems, but that obviously
 doesn't say much about old versions.
 
 Afaics we interestingly don't have any poll-less buildfarm animals that
 use unix_latch.c...

More likely is that some system will have a poll() exhibiting the same bug,
possibly via poll() being a wrapper around select().  Systems without poll()
are hard to find; the gnulib manual lists only Windows, BeOS and HP NonStop
OS.  HP NonStop OS is the one possibly of interest here.  I have never
personally seen a machine running it.

I recommend either (a) taking no action or (b) adding a regression test
verifying WaitLatchOrSocket() conformance in this scenario.  Then we can
decide what more to do if failure evidence emerges.

  We use poll() there anyway, so the bug in that configuration does not
  affect PostgreSQL.  Is it a bellwether of similar bugs in other
  implementations, bugs that will affect PostgreSQL?
 
 Hm. I can think of two stopgap measures we could add:
 
 1) If we're using select() and WL_SOCKET_WRITEABLE is set without
_READABLE, add a timeout of Min(1s, Max(passed_timeout, 1s)). As the
time spent waiting only for writable normally shouldn't be very long,
that shouldn't be noticeably bad for power usage.
 2) Add a SIGPIPE handler that just does a SetLatch(MyLach).

I'm having trouble visualizing those proposed measures in detail, but I trust
that a decent workaround would emerge.

   + if (pfds[0].revents  (POLLHUP | POLLERR | POLLNVAL))
   + {
   + /* EOF/error condition */
   + if (wakeEvents  WL_SOCKET_READABLE)
   + result |= WL_SOCKET_READABLE;
   + if (wakeEvents  WL_SOCKET_WRITEABLE)
   + result |= WL_SOCKET_WRITEABLE;
   + }
  
  With some poll() implementations (e.g. OS X), this can wrongly report
  WL_SOCKET_WRITEABLE if the peer used shutdown(SHUT_WR).  I tentatively think
  that's acceptable.  libpq does not use shutdown(), and other client 
  interfaces
  would do so at their own risk.  Should we worry about hostile clients 
  creating
  a denial-of-service by causing a server send() to block unexpectedly?
  Probably not; a user able to send arbitrary TCP traffic to the postmaster 
  port
  can already achieve that.
 
 Yea, this doesn't seem particularly concerning.
 
 a) They can just stop consuming writes and use noticeable amounts of
 memory by doing output intensive queries. That uses significant os
 resources and is much harder to detect - today.

If there's anything to worry about here (unlikely), it would be with respect
to not-yet-authenticated connections only.

 b) does accepting WL_SOCKET_WRITEABLE without _READABLE change anything
 here? We already allow _WRITABLE...

Today's code translates POLLHUP to WL_SOCKET_READABLE; it must see POLLOUT to
set WL_SOCKET_WRITEABLE.  Your patch changes that.

 What happens if you write/send() in
 that state, btw?

write() reports EAGAIN.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-11 Thread Noah Misch

On Sat, Jan 10, 2015 at 03:25:42AM +0100, Andres Freund wrote:
 0001-Allow-latches-to-wait-for-socket-writability-without.patch
  Imo pretty close to commit and can be committed independently.

The key open question is whether all platforms of interest can reliably detect
end-of-file when poll()ing or select()ing for write only.  Older GNU/Linux
select() cannot; see attached test program.  We use poll() there anyway, so
the bug in that configuration does not affect PostgreSQL.  Is it a bellwether
of similar bugs in other implementations, bugs that will affect PostgreSQL?

 This previously had explicitly been forbidden in e42a21b9e6c9, as
 there was no use case at that point. We now are looking into making
 FE/BE communication use latches, so it

Truncated sentence.

 + if (pfds[0].revents  (POLLHUP | POLLERR | POLLNVAL))
 + {
 + /* EOF/error condition */
 + if (wakeEvents  WL_SOCKET_READABLE)
 + result |= WL_SOCKET_READABLE;
 + if (wakeEvents  WL_SOCKET_WRITEABLE)
 + result |= WL_SOCKET_WRITEABLE;
 + }

With some poll() implementations (e.g. OS X), this can wrongly report
WL_SOCKET_WRITEABLE if the peer used shutdown(SHUT_WR).  I tentatively think
that's acceptable.  libpq does not use shutdown(), and other client interfaces
would do so at their own risk.  Should we worry about hostile clients creating
a denial-of-service by causing a server send() to block unexpectedly?
Probably not; a user able to send arbitrary TCP traffic to the postmaster port
can already achieve that.

 + if (resEvents.lNetworkEvents  FD_CLOSE)
 + {
 + if (wakeEvents  WL_SOCKET_READABLE)
 + result |= WL_SOCKET_READABLE;
 + if (wakeEvents  WL_SOCKET_WRITEABLE)
 + result |= WL_SOCKET_WRITEABLE;
 + }
 +
   }

Extra blank line.
/*
 * Test whether select() can report write-ready on a peer-closed TCP socket when
 * the send buffer is full.  Though write() won't block, some GNU/Linux systems
 * fail to report write-ready.  RHEL 6.6 (kernel-2.6.32-431.23.3.el6.x86_64,
 * glibc-2.12-1.149.el6.x86_64) has the bug.  RHEL 7.0
 * (kernel-3.10.0-123.8.1.el7.x86_64, glibc-2.17-55.el7_0.3.x86_64) does not.
 */

#include errno.h
#include fcntl.h
#include limits.h
#include netinet/in.h
#include signal.h
#include stdio.h
#include stdlib.h
#include sys/socket.h
#include sys/time.h
#include sys/types.h
#include unistd.h

/* Check a syscall return value. */
void
CSYS(int res)
{
if (res  0)
{
perror(some syscall);
exit(EXIT_FAILURE);
}
}

/* Like socketpair(), but use a loopback TCP connection. */
static void tcppair(int fd[2])
{
struct sockaddr_in addr;
int srv;
int one = 1;
int flags;

addr.sin_family = AF_INET;
addr.sin_port = htons(17531);
addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

CSYS(srv = socket(AF_INET, SOCK_STREAM, 0));
CSYS(setsockopt(srv, SOL_SOCKET, SO_REUSEADDR,
(char *) one, sizeof(one)));
CSYS(bind(srv, (struct sockaddr *) addr, sizeof(addr)));
CSYS(listen(srv, 8));

CSYS(fd[1] = socket(AF_INET, SOCK_STREAM, 0));
CSYS(flags = fcntl(fd[1], F_GETFL));
CSYS(fcntl(fd[1], F_SETFL, flags | O_NONBLOCK));
if (connect(fd[1],  (struct sockaddr *) addr, sizeof(addr)) = 0 ||
errno != EINPROGRESS)
{
perror(connect);
exit(EXIT_FAILURE);
}
CSYS(fcntl(fd[1], F_SETFL, flags));

CSYS(fd[0] = accept(srv, NULL, NULL));
CSYS(close(srv));
}

/* Does select() consider the fd ready for writing? */
int
select_write(int fd)
{
   fd_set writes;
   struct timeval timeo;

   FD_ZERO(writes);
   FD_SET(fd, writes);
   timeo.tv_sec = 1;
   timeo.tv_usec = 0;
   CSYS(select(fd + 1, NULL, writes, NULL, timeo));
   return FD_ISSET(fd, writes);
}

/* Write to a socket until writes would block. */
void
fill(int fd)
{
int flags;
int total;

CSYS(flags = fcntl(fd, F_GETFL));
CSYS(fcntl(fd, F_SETFL, flags | O_NONBLOCK));

/*
 * On both Linux and OS X, select() can report the socket as write-ready
 * immediately after a write() reported EAGAIN.  Loop until both sources
 * agree that the socket is out of storage.
 */
do
{
char buf[32 * PIPE_BUF];
int n;

total = 0;
while ((n = write(fd, buf, sizeof(buf)))  0)
total += n;
if (errno != EAGAIN)
perror(write);

printf(wrote %d bytes\n, total);
} while (total  0  select_write(fd));

CSYS(fcntl(fd, F_SETFL, flags));
}

int
main(int argc, char **argv)
{
int fd[2];

signal(SIGPIPE, SIG_IGN);
tcppair(fd);

printf(before close:%s

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-11 Thread Andres Freund

On 2015-01-11 16:47:53 -0500, Robert Haas wrote:
  My guess that's not so much the overhead of the latch itself, but the
  lack of the directed wakeup stuff the OS provides for semaphores.
 
 That's possible.

 On Sat, Jan 10, 2015 at 11:35 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Interesting. I dimly remembered you mentioning this, that's how I
  rediscovered this message.
 
  Do you remember any details?
 
 No, not really.

I've done a hackish conversion of proc.c to use semaphores and in a
sleeping heavy workload (pgbench -j 390 against a scale 1 db) there
originally was about a %5 regression (fluctuating a fair bit).

I've played with hacking unix_latch.c to

a) use eventfd() for the local latch and (no performance difference
   anymore)

b) Adding a eventfd to struct Latch for latches that are created from
postmaster. That fd can directly be written to by other processes
(combined slightly faster than semaphores).

Not sure how much b) is acceptable, due to using MaxBackend fds. And
both only help linux.

  If we could replace all usages of semaphores that set immediate
  interrupts to ok, we could quite easily make the deadlock detector
  et. al. run outside of signal handlers. That would imo make it more
  robust, and easier to understand - right now the correctness of locking
  done in the deadlock detector isn't obvious.  With the infrastructure in
  place it'd also allow your new parallelism code to run outside of signal
  handlers.
 
 Yes, I would be very happy to see ImmediateInterruptOK die in a fire.

Yea, I think it's a absolutely horrible idea.

  Unfortunately currently sempahores can't be unlocked in a signal handler
  (as sysv semaphores aren't signal safe)... It'd also be not so nice to
  set both a latch and semaphores in every signal handler.
 
 Agreed.

I've since (re-)realised that we've actually relied on semaphore being
signal safe for years. The PGSemaphoreLock() in proc.c allows
interrupts. And the deadlock detector then uses lwlocks, which in turn
use semaphores.  That sucks.

  Anyway, the discussion since cleared up that we need the self byte to
  handle a race, anyway.
 
 Eh?

I'm referring to the point Heikki has made in his second paragraph in
5408638c.1080...@vmware.com .


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-11 Thread Andres Freund

On 2015-01-11 16:36:07 -0500, Noah Misch wrote:
 On Sat, Jan 10, 2015 at 03:25:42AM +0100, Andres Freund wrote:
  0001-Allow-latches-to-wait-for-socket-writability-without.patch
   Imo pretty close to commit and can be committed independently.
 
 The key open question is whether all platforms of interest can reliably detect
 end-of-file when poll()ing or select()ing for write only.  Older GNU/Linux
 select() cannot; see attached test program.

Yuck. By my reading that's a violation of posix.

I did test it a bit, and I didn't see problems, but that obviously
doesn't say much about old versions.

Afaics we interestingly don't have any poll-less buildfarm animals that
use unix_latch.c...

 We use poll() there anyway, so the bug in that configuration does not
 affect PostgreSQL.  Is it a bellwether of similar bugs in other
 implementations, bugs that will affect PostgreSQL?

Hm. I can think of two stopgap measures we could add:

1) If we're using select() and WL_SOCKET_WRITEABLE is set without
   _READABLE, add a timeout of Min(1s, Max(passed_timeout, 1s)). As the
   time spent waiting only for writable normally shouldn't be very long,
   that shouldn't be noticeably bad for power usage.
2) Add a SIGPIPE handler that just does a SetLatch(MyLach).

  This previously had explicitly been forbidden in e42a21b9e6c9, as
  there was no use case at that point. We now are looking into making
  FE/BE communication use latches, so it
 
 Truncated sentence.

Fixed in what I've since pushed (as Heikki basically was ok with the
patch sent a couple months back, modulo some fixes)...

  +   if (pfds[0].revents  (POLLHUP | POLLERR | POLLNVAL))
  +   {
  +   /* EOF/error condition */
  +   if (wakeEvents  WL_SOCKET_READABLE)
  +   result |= WL_SOCKET_READABLE;
  +   if (wakeEvents  WL_SOCKET_WRITEABLE)
  +   result |= WL_SOCKET_WRITEABLE;
  +   }
 
 With some poll() implementations (e.g. OS X), this can wrongly report
 WL_SOCKET_WRITEABLE if the peer used shutdown(SHUT_WR).  I tentatively think
 that's acceptable.  libpq does not use shutdown(), and other client interfaces
 would do so at their own risk.  Should we worry about hostile clients creating
 a denial-of-service by causing a server send() to block unexpectedly?
 Probably not; a user able to send arbitrary TCP traffic to the postmaster port
 can already achieve that.

Yea, this doesn't seem particularly concerning.

a) They can just stop consuming writes and use noticeable amounts of
memory by doing output intensive queries. That uses significant os
resources and is much harder to detect - today.

b) does accepting WL_SOCKET_WRITEABLE without _READABLE change anything
here? We already allow _WRITABLE... What happens if you write/send() in
that state, btw?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-10 Thread Andres Freund

On 2014-09-04 08:49:22 -0400, Robert Haas wrote:
 On Tue, Sep 2, 2014 at 3:01 PM, Andres Freund and...@2ndquadrant.com wrote:
  I'm slightly worried about the added overhead due to the latch code. In
  my implementation I only use latches after a nonblocking read, but
  still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if
  that can be made problematic.
 
 I think that's not the word you're looking for.

There's a less missing...

 At some point I hacked up a very crude prototype that made LWLocks use
 latches to sleep instead of semaphores.  It was slow.

Interesting. I dimly remembered you mentioning this, that's how I
rediscovered this message.

Do you remember any details?

My guess that's not so much the overhead of the latch itself, but the
lack of the directed wakeup stuff the OS provides for semaphores.


If we could replace all usages of semaphores that set immediate
interrupts to ok, we could quite easily make the deadlock detector
et. al. run outside of signal handlers. That would imo make it more
robust, and easier to understand - right now the correctness of locking
done in the deadlock detector isn't obvious.  With the infrastructure in
place it'd also allow your new parallelism code to run outside of signal
handlers.

Unfortunately currently sempahores can't be unlocked in a signal handler
(as sysv semaphores aren't signal safe)... It'd also be not so nice to
set both a latch and semaphores in every signal handler.


 AIUI, the only reason why we need the self-pipe thing is because on
 some platforms signals don't interrupt system calls.  But my
 impression was that those platforms were somewhat obscure.

To the contrary, I think it's only very obscure platforms where signals
still interrupt syscalls - we set SA_RESTART for pretty much
everything. There's a couple of system calls that ignore SA_RESTART. For
some that's defined in posix, for others it's operating system
specific. E.g. on linux semop(), poll(), select() are defined to always
return EINTR when interrupted.

Anyway, the discussion since cleared up that we need the self byte to
handle a race, anyway.

 Basically, it doesn't feel like a good thing that we've got two sets
 of primitives for making a backend wait that (1) don't really know
 about each other and (2) use different operating system primitives.
 Presumably one of the two systems is better; let's figure out which
 one it is, use that one all the time, and get rid of the other one.

I think the latch interface is clearly better for what we use
sema/latches for as it allows to wait for signals (latch sets), a socket
and timeouts. So let's try to figure out how to make it perform
comparably or better than semaphores.

There's imo only one semaphore user that can't trivially be replaced by
latches: the semaphore spinlock emulation. Both proc.c and and lwlock.c
can be converted quite easily - in the latter case, it might actually
end up saving some code.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-09 Thread Andres Freund

On 2014-11-17 18:22:54 +0300, Alex Shulgin wrote:
 
 Andres Freund and...@2ndquadrant.com writes:
 
  I've invested some more time in this:
  0002 now makes sense on its own and doesn't change anything around the
   interrupt handling. Oh, and it compiles without 0003.
 
 In this patch, the endif appears to be misplaced in PostgresMain:
 
 + if (MyProcPort != NULL)
 + {
 +#ifdef WIN32
 + pgwin32_noblock = true;
 +#else
 + if (!pg_set_noblock(MyProcPort-sock))
 + ereport(COMMERROR,
 + (errmsg(could not set socket to 
 nonblocking mode: %m)));
 + }
 +#endif
 +

Uh. Odd. Anyway, that bit of code is now somewhere else anyway...

 One thing I did try is sending a NOTICE to the client when in
 ProcessInterrupts() and DoingCommandRead is true.  I think[1] it was
 expected to be delivered instantly, but actually the client (psql) only
 displays it after sending the next statement.

Yea, that should be psql specific though. I hope ;)

 While I'm reading on FE/BE protocol someone might want to share his
 wisdom on this subject.  My guess: psql blocks on readline/libedit call
 and can't effectively poll the server socket before complete input from
 user.

I'm not sure if it's actually a can't. It doesn't at least ;)

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-08 Thread Andres Freund

On 2014-10-03 16:26:35 +0200, Andres Freund wrote:
 On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote:
  0002 now makes sense on its own and doesn't change anything around the
interrupt handling. Oh, and it compiles without 0003.
  
  WaitLatchOrSocket() can throw an error, so it's not totally safe to call
  that underneath OpenSSL.
 
 Hm. Fair point.

I think we should fix this by simply prohibiting
WaitLatch/WaitLatchOrSocket from ERRORing out. The easiest, and imo
acceptable, thing is to simply convert the relevant ERRORs to FATAL. I
think that'd be perfectly fine as it seems very unlikely that we
continue sanely afterwards.

It would really be nice if we had a simple way to raise a FATAL that
won't go to the client for situations like this. I'd proposed
elog(FATAL | COMERROR, ...) in the past...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-12-16 Thread Kyotaro HORIGUCHI

Hello,

 On 2014-12-15 18:19:26 +0900, Kyotaro HORIGUCHI wrote:
  Since I don't have clear idea how to promote this, I will remake
  and be back with new patch based on Andres' for patches.
 
 Do my patches miss any functionality you want?

The patch satisfies what I want, as I said upthread. What I don't
know is how I can go on with it in this CF topic, in other word
what should I do in order to put it to ready for committer?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-12-15 Thread Kyotaro HORIGUCHI

Since I don't have clear idea how to promote this, I will remake
and be back with new patch based on Andres' for patches.

 Hmm.. Sorry for my stupidity.
 
  Why is that necessary? It seems really rather wrong to make
  BIO_set_retry_write() dependant on ProcDiePending? Especially as, at
  least in my testing, it's not even required because the be_tls_write()
  can just check the error properly?
 
 I mistook the previous conversation as it doesn't work as
 expected. I confirmed that it works fine.
 
 After all, it works as I expected. The parameter for
 ProcessClientWriteInterrupt() looks somewhat uneasy but the patch
 4 looks fine as a whole. Do you have anything to worry about in
 the patch?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-12-15 Thread Andres Freund

On 2014-12-15 18:19:26 +0900, Kyotaro HORIGUCHI wrote:
 Since I don't have clear idea how to promote this, I will remake
 and be back with new patch based on Andres' for patches.

Do my patches miss any functionality you want?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-12-14 Thread Michael Paquier

On Thu, Oct 30, 2014 at 10:27 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 10/03/2014 06:29 PM, Heikki Linnakangas wrote:
 [blah]
 About patch 3:
 Looking closer, this design still looks OK to me. As you said yourself, the
 comments need some work (e.g. the step 5. in the top comment in async.c
 needs updating). And then there are a couple of FIXME and XXX comments that
 need to be addressed.
Those patches have not been updated in a while, and I am seeing some
feedback from several people, hence returning as returned with
feedback. Horiguchi-san, feel free to add new entries on the CF app in
2014-12 or move this entry if you feel overwise.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-11-17 Thread Alex Shulgin


Andres Freund and...@2ndquadrant.com writes:

 I've invested some more time in this:
 0002 now makes sense on its own and doesn't change anything around the
  interrupt handling. Oh, and it compiles without 0003.

In this patch, the endif appears to be misplaced in PostgresMain:

+   if (MyProcPort != NULL)
+   {
+#ifdef WIN32
+   pgwin32_noblock = true;
+#else
+   if (!pg_set_noblock(MyProcPort-sock))
+   ereport(COMMERROR,
+   (errmsg(could not set socket to 
nonblocking mode: %m)));
+   }
+#endif
+
pqinitmask();

 0003 Sinval/notify processing got simplified further. There really isn't
  any need for DisableNotifyInterrupt/DisableCatchupInterrupt
  anymore. Also begin_client_read/client_read_ended don't make much
  sense anymore. Instead introduce ProcessClientReadInterrupt (which
  wants a better name).
 There's also a very WIP
 0004 Allows secure_read/write be interrupted when ProcDiePending is
  set. All of that happens via the latch mechanism, nothing happens
  inside signal handlers. So I do think it's quite an improvement
  over what's been discussed in this thread.
  But it (and the other approaches) do noticeably increase the
  likelihood of clients not getting the error message if the client
  isn't actually dead. The likelihood of write() being blocked
  *temporarily* due to normal bandwidth constraints is quite high
  when you consider COPY FROM and similar. Right now we'll wait till
  we can put the error message into the socket afaics.

 1-3 need some serious comment work, but I think the approach is
 basically sound. I'm much, much less sure about allowing send() to be
 interrupted.

After re-reading these I don't see the rest of items I wanted to inqury
about anymore, so it just makes more sense now.

One thing I did try is sending a NOTICE to the client when in
ProcessInterrupts() and DoingCommandRead is true.  I think[1] it was
expected to be delivered instantly, but actually the client (psql) only
displays it after sending the next statement.

While I'm reading on FE/BE protocol someone might want to share his
wisdom on this subject.  My guess: psql blocks on readline/libedit call
and can't effectively poll the server socket before complete input from
user.

--
Alex

[1] http://www.postgresql.org/message-id/1262173040.19367.5015.camel@ebony

  ``AFAIK, NOTICE was suggested because it can be sent at any time,
whereas ERRORs are only associated with statements.''


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-30 Thread Heikki Linnakangas


On 10/03/2014 06:29 PM, Heikki Linnakangas wrote:

On 10/03/2014 05:26 PM, Andres Freund wrote:

On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote:

On 09/28/2014 01:54 AM, Andres Freund wrote:

0003 Sinval/notify processing got simplified further. There really isn't
   any need for DisableNotifyInterrupt/DisableCatchupInterrupt
   anymore. Also begin_client_read/client_read_ended don't make much
   sense anymore. Instead introduce ProcessClientReadInterrupt (which
   wants a better name).
There's also a very WIP
0004 Allows secure_read/write be interrupted when ProcDiePending is
   set. All of that happens via the latch mechanism, nothing happens
   inside signal handlers. So I do think it's quite an improvement
   over what's been discussed in this thread.
   But it (and the other approaches) do noticeably increase the
   likelihood of clients not getting the error message if the client
   isn't actually dead. The likelihood of write() being blocked
   *temporarily* due to normal bandwidth constraints is quite high
   when you consider COPY FROM and similar. Right now we'll wait till
   we can put the error message into the socket afaics.

1-3 need some serious comment work, but I think the approach is
basically sound. I'm much, much less sure about allowing send() to be
interrupted.


Yeah, 1-3 seem sane.


I think 3 also needs a careful look. Have you looked through it? While
imo much less complex than before, there's some complex interactions in
the touched code. And we have terrible coverage of both catchup
interrupts and notify stuff...


I only looked at the .patch, I didn't apply it, so I didn't look at the
context much. But I don't see any fundamental problem with it. I would
like to have a closer look before it's committed, though.


About patch 3:

Looking closer, this design still looks OK to me. As you said yourself, 
the comments need some work (e.g. the step 5. in the top comment in 
async.c needs updating). And then there are a couple of FIXME and XXX 
comments that need to be addressed.


The comment on PGPROC.procLatch in storage/proc.h says just this:


Latch   procLatch;  /* generic latch for process */


This needs a lot more explaining. It's now used by signal handlers to 
interrupt a read or write to the socket; that should be mentioned. What 
else is it used for? (for waking up a backend in synchronous 
replication, at least) What are the rules on when to arm it and when to 
reset it?


Would it be more clear to use a separate, backend-private, latch, for 
the signals? I guess that won't work, though, because sometimes we need 
need to wait for a wakeup from a different process or from a signal at 
the same time (SyncRepWaitForLSN() in particular). Not without adding a 
variant of WaitLatch that can wait on two latches simultaneously, anyway.


The assumption in secure_raw_read that MyProc exists is pretty 
surprising. I understand why it's that way, and there's a comment in 
PostgresMain explaining why the socket cannot be put into non-blocking 
mode earlier, but it's still a bit whacky. Not sure what to do about that.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-30 Thread Andres Freund

On 2014-10-30 15:27:13 +0200, Heikki Linnakangas wrote:
 The comment on PGPROC.procLatch in storage/proc.h says just this:
 
  Latch   procLatch;  /* generic latch for process */
 
 This needs a lot more explaining. It's now used by signal handlers to
 interrupt a read or write to the socket; that should be mentioned. What else
 is it used for? (for waking up a backend in synchronous replication, at
 least) What are the rules on when to arm it and when to reset it?

Hm. I agree it use expaned commentary, but I'm unsure if that much
detail is really warranted. Any such documentation seems to be almost
guaranteed to be out of date. As evidenced that there's none to date...

 Would it be more clear to use a separate, backend-private, latch, for the
 signals? I guess that won't work, though, because sometimes we need need to
 wait for a wakeup from a different process or from a signal at the same time
 (SyncRepWaitForLSN() in particular). Not without adding a variant of
 WaitLatch that can wait on two latches simultaneously, anyway.

I wondered the same, but I don't really see what it'd buy us during
normal running. It seems like it'd just make code more complex without
leading to relevantly fewer wakeups.

 The assumption in secure_raw_read that MyProc exists is pretty surprising. I
 understand why it's that way, and there's a comment in PostgresMain
 explaining why the socket cannot be put into non-blocking mode earlier, but
 it's still a bit whacky. Not sure what to do about that.

It makes me quite unhappy too. I looked what it'd take to make proclatch
available earlier, but it wasn't pretty. I wondered whether we could use
a 'early proc latch' in MyProcLatch that used until we're fully attached
to shared memory. Then, when attaching to shared memory we'd set the old
latch once, and reset MyProcLatch to the shared memory one.

But that's pretty ugly.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-10 Thread Kyotaro HORIGUCHI

Hmm.. Sorry for my stupidity.

 Why is that necessary? It seems really rather wrong to make
 BIO_set_retry_write() dependant on ProcDiePending? Especially as, at
 least in my testing, it's not even required because the be_tls_write()
 can just check the error properly?

I mistook the previous conversation as it doesn't work as
expected. I confirmed that it works fine.

After all, it works as I expected. The parameter for
ProcessClientWriteInterrupt() looks somewhat uneasy but the patch
4 looks fine as a whole. Do you have anything to worry about in
the patch?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-09 Thread Andres Freund

On 2014-10-09 14:06:35 +0900, Kyotaro HORIGUCHI wrote:
 Hello, simplly inhibit set retry flag when ProcDiePending in
 my_sock_write seems enough.
 
 But it returns with SSL_ERROR_SYSCALL not SSL_ERROR_WANT_WRITE so
 I modified the patch 4 as the attached patch.

Why is that necessary? It seems really rather wrong to make
BIO_set_retry_write() dependant on ProcDiePending? Especially as, at
least in my testing, it's not even required because the be_tls_write()
can just check the error properly?

 diff --git a/src/backend/libpq/be-secure-openssl.c 
 b/src/backend/libpq/be-secure-openssl.c
 index 6fc6903..2288fe2 100644
 --- a/src/backend/libpq/be-secure-openssl.c
 +++ b/src/backend/libpq/be-secure-openssl.c
 @@ -750,7 +750,8 @@ my_sock_write(BIO *h, const char *buf, int size)
   BIO_clear_retry_flags(h);
   if (res = 0)
   {
 - if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
 + if (!ProcDiePending 
 + (errno == EINTR || errno == EWOULDBLOCK || errno == 
 EAGAIN))
   {
   BIO_set_retry_write(h);
   }


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-08 Thread Kyotaro HORIGUCHI

Hello, simplly inhibit set retry flag when ProcDiePending in
my_sock_write seems enough.

But it returns with SSL_ERROR_SYSCALL not SSL_ERROR_WANT_WRITE so
I modified the patch 4 as the attached patch.

Finally, the attached patch works as expected with Andres's patch
1-3.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 6fc6903..2288fe2 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -750,7 +750,8 @@ my_sock_write(BIO *h, const char *buf, int size)
 	BIO_clear_retry_flags(h);
 	if (res = 0)
 	{
-		if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
+		if (!ProcDiePending 
+			(errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN))
 		{
 			BIO_set_retry_write(h);
 		}
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 7b5b30f..ab9e122 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -199,6 +199,7 @@ secure_write(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
+retry:
 #ifdef USE_SSL
 	if (port-ssl_in_use)
 	{
@@ -210,7 +211,26 @@ secure_write(Port *port, void *ptr, size_t len)
 		n = secure_raw_write(port, ptr, len);
 	}
 
-	/* XXX: We likely will want to process some interrupts here */
+	/*
+	 * We only want to process the interrupt here if socket writes are
+	 * blocking to increase the chance to get an error message to the
+	 * client. If we're not blocked there'll soon be a
+	 * CHECK_FOR_INTERRUPTS(). But if we're blocked we'll never get out of
+	 * that situation if the client has died.
+	 */
+	if (ProcDiePending  !port-noblock  n  0)
+	{
+		/*
+		 * We're dying. It's safe (and sane) to handle that now.
+		 */
+		CHECK_FOR_INTERRUPTS();
+	}
+
+	/* retry after processing interrupts */
+	if (n  0  errno == EINTR)
+	{
+		goto retry;
+	}
 
 	return n;
 }
@@ -236,10 +256,19 @@ wloop:
 		 * don't do anything while (possibly) inside a ssl library.
 		 */
 		w = WaitLatchOrSocket(MyProc-procLatch,
-			  WL_SOCKET_WRITEABLE,
+			  WL_LATCH_SET | WL_SOCKET_WRITEABLE,
 			  port-sock, 0);
 
-		if (w  WL_SOCKET_WRITEABLE)
+		if (w  WL_LATCH_SET)
+		{
+			ResetLatch(MyProc-procLatch);
+			/*
+			 * Force a return, so interrupts can be processed when not
+			 * (possibly) underneath a ssl library.
+			 */
+			errno = EINTR;
+		}
+		else if (w  WL_SOCKET_WRITEABLE)
 		{
 			goto wloop;
 		}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3a6aa1c..61390aa 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -520,6 +520,13 @@ ProcessClientReadInterrupt(void)
 
 		errno = save_errno;
 	}
+	else if (ProcDiePending)
+	{
+		/*
+		 * We're dying. It's safe (and sane) to handle that now.
+		 */
+		CHECK_FOR_INTERRUPTS();
+	}
 }
 
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-05 Thread Andres Freund

On 2014-10-03 18:29:23 +0300, Heikki Linnakangas wrote:
 On 10/03/2014 05:26 PM, Andres Freund wrote:
 On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote:
 On 09/28/2014 01:54 AM, Andres Freund wrote:
 0003 Sinval/notify processing got simplified further. There really isn't
   any need for DisableNotifyInterrupt/DisableCatchupInterrupt
   anymore. Also begin_client_read/client_read_ended don't make much
   sense anymore. Instead introduce ProcessClientReadInterrupt (which
   wants a better name).
 There's also a very WIP
 0004 Allows secure_read/write be interrupted when ProcDiePending is
   set. All of that happens via the latch mechanism, nothing happens
   inside signal handlers. So I do think it's quite an improvement
   over what's been discussed in this thread.
   But it (and the other approaches) do noticeably increase the
   likelihood of clients not getting the error message if the client
   isn't actually dead. The likelihood of write() being blocked
   *temporarily* due to normal bandwidth constraints is quite high
   when you consider COPY FROM and similar. Right now we'll wait till
   we can put the error message into the socket afaics.
 
 1-3 need some serious comment work, but I think the approach is
 basically sound. I'm much, much less sure about allowing send() to be
 interrupted.
 
 Yeah, 1-3 seem sane.
 
 I think 3 also needs a careful look. Have you looked through it? While
 imo much less complex than before, there's some complex interactions in
 the touched code. And we have terrible coverage of both catchup
 interrupts and notify stuff...
 
 I only looked at the .patch, I didn't apply it, so I didn't look at the
 context much. But I don't see any fundamental problem with it. I would like
 to have a closer look before it's committed, though.

I'd appreciate that. I don't want to commit it without a careful review
of another committer.

 There's also the concern that using a latch for client communication
 increases the number of syscalls for the same work. We should at least
 try to quantify that...
 
 I'm not too concerned about that, since we only do extra syscalls when the
 socket isn't immediately available for reading/writing, i.e. when we have to
 sleep anyway.

Well, kernels actually do some nice optimizations for blocking reads -
at least for local sockets. Like switching to the other process
immediately and such.
I'm not super concerned either, but I think we should try to measure
it. And if we're failing, we probably should try to address these
problems - if possible in the latch code.

 4 also looks OK to me at a quick glance. It basically
 enables handling the die interrupt immediately, if we're blocked in a read
 or write. It won't be handled in the signal handler, but within the
 secure_read/write call anyway.
 
 What are you thinking about the concern that it'll reduce the likelihood
 of transferring the error message to the client? I tried to reduce that
 by only allowing errors when write() blocks, but that's not an
 infrequent event.
 
 I'm not too concerned about that either. I mean, it's probably true that it
 reduces the likelihood, but I don't particularly care myself. But if we
 care, we could use a timeout there, so that if we receive a SIGTERM while
 blocked on a send(), we wait for a few seconds to see if we can send
 whatever we were sending, before terminating the backend.
 
 
 What should we do with this patch in the commitfest?

I think feature should be ddeclare as 'returned with feedback' for this
commitfest. I've done so.

I don't see much of a reason waiting with patch 1 till the next
commitfest. It imo looks uncontroversial and doesn't have any far
reaching consequences.

 Are you planning to clean up and commit these patches?

I plan to do so one by one, yes. If you'd like to pick up any of them,
feel free to do (after telling me, to avoid duplicated efforts). I don't
feel proprietary about any of them. But I guess you have other stuff
you'd like to work on too ;)

I'll try to send out a version with the stuff you mentioned earlier in
the next couple days.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-03 Thread Heikki Linnakangas


On 09/28/2014 01:54 AM, Andres Freund wrote:

I've invested some more time in this:


Thanks!

In 0001, the select() codepath will not return (WL_SOCKET_READABLE | 
WL_SOCKET_WRITEABLE) on EOF or error, like the comment says and like the 
poll() path does. It only sets WL_SOCKET_READABLE if WL_SOCKET_READABLE 
was requested as a wake-event, and likewise for writeable, while the 
poll() codepath returns (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE) 
regardless of the requested wake-events. I'm not sure which is actually 
better - a separate WL_SOCKET_ERROR code might be best - but it's 
inconsistent as it is.



0002 now makes sense on its own and doesn't change anything around the
  interrupt handling. Oh, and it compiles without 0003.


WaitLatchOrSocket() can throw an error, so it's not totally safe to call 
that underneath OpenSSL. Admittedly the cases where it throws an error 
are shouldn't happen cases like poll() failed or read() on 
self-pipe failed, but still. Perhaps those errors should be 
reclassified as FATAL; it's not clear you can just roll back and expect 
to continue running if any of them happens.


In secure_raw_write(), you need to save and restore errno, as 
WaitLatchOrSocket will not preserve it. If secure_raw_write() calls 
WaitLatchOrSocket(), and it returns because the latch was set, and we 
fall out of secure_raw_write, we will return -1 but the errno might not 
be set to anything sensible anymore.



0003 Sinval/notify processing got simplified further. There really isn't
  any need for DisableNotifyInterrupt/DisableCatchupInterrupt
  anymore. Also begin_client_read/client_read_ended don't make much
  sense anymore. Instead introduce ProcessClientReadInterrupt (which
  wants a better name).
There's also a very WIP
0004 Allows secure_read/write be interrupted when ProcDiePending is
  set. All of that happens via the latch mechanism, nothing happens
  inside signal handlers. So I do think it's quite an improvement
  over what's been discussed in this thread.
  But it (and the other approaches) do noticeably increase the
  likelihood of clients not getting the error message if the client
  isn't actually dead. The likelihood of write() being blocked
  *temporarily* due to normal bandwidth constraints is quite high
  when you consider COPY FROM and similar. Right now we'll wait till
  we can put the error message into the socket afaics.

1-3 need some serious comment work, but I think the approach is
basically sound. I'm much, much less sure about allowing send() to be
interrupted.


Yeah, 1-3 seem sane. 4 also looks OK to me at a quick glance. It 
basically enables handling the die interrupt immediately, if we're 
blocked in a read or write. It won't be handled in the signal handler, 
but within the secure_read/write call anyway.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-03 Thread Andres Freund

Hi,

On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote:
 On 09/28/2014 01:54 AM, Andres Freund wrote:
 I've invested some more time in this:
 
 Thanks!
 
 In 0001, the select() codepath will not return (WL_SOCKET_READABLE |
 WL_SOCKET_WRITEABLE) on EOF or error, like the comment says and like the
 poll() path does. It only sets WL_SOCKET_READABLE if WL_SOCKET_READABLE was
 requested as a wake-event, and likewise for writeable, while the poll()
 codepath returns (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE) regardless of
 the requested wake-events. I'm not sure which is actually better - a
 separate WL_SOCKET_ERROR code might be best - but it's inconsistent as it
 is.

Hm. Right. I think we should only report the requested state. We can't
really discern wether it's a hangup, error or actually readable/writable
with select() - it just returns the socket as readable/writable as soon
as it doesn't block anymore. Where not blocking includes the connection
having gone bad.

It took me a while to figure out whether that's actually guaranteed by
the spec, but I'm pretty sure it is...

 0002 now makes sense on its own and doesn't change anything around the
   interrupt handling. Oh, and it compiles without 0003.
 
 WaitLatchOrSocket() can throw an error, so it's not totally safe to call
 that underneath OpenSSL.

Hm. Fair point.

 Admittedly the cases where it throws an error are
 shouldn't happen cases like poll() failed or read() on self-pipe
 failed, but still. Perhaps those errors should be reclassified as FATAL;
 it's not clear you can just roll back and expect to continue running if any
 of them happens.

Fine with me.

 In secure_raw_write(), you need to save and restore errno, as
 WaitLatchOrSocket will not preserve it. If secure_raw_write() calls
 WaitLatchOrSocket(), and it returns because the latch was set, and we fall
 out of secure_raw_write, we will return -1 but the errno might not be set to
 anything sensible anymore.

Oops.

 0003 Sinval/notify processing got simplified further. There really isn't
   any need for DisableNotifyInterrupt/DisableCatchupInterrupt
   anymore. Also begin_client_read/client_read_ended don't make much
   sense anymore. Instead introduce ProcessClientReadInterrupt (which
   wants a better name).
 There's also a very WIP
 0004 Allows secure_read/write be interrupted when ProcDiePending is
   set. All of that happens via the latch mechanism, nothing happens
   inside signal handlers. So I do think it's quite an improvement
   over what's been discussed in this thread.
   But it (and the other approaches) do noticeably increase the
   likelihood of clients not getting the error message if the client
   isn't actually dead. The likelihood of write() being blocked
   *temporarily* due to normal bandwidth constraints is quite high
   when you consider COPY FROM and similar. Right now we'll wait till
   we can put the error message into the socket afaics.
 
 1-3 need some serious comment work, but I think the approach is
 basically sound. I'm much, much less sure about allowing send() to be
 interrupted.
 
 Yeah, 1-3 seem sane.

I think 3 also needs a careful look. Have you looked through it? While
imo much less complex than before, there's some complex interactions in
the touched code. And we have terrible coverage of both catchup
interrupts and notify stuff...

Tom, do you happen to have time to look at that bit?


There's also the concern that using a latch for client communication
increases the number of syscalls for the same work. We should at least
try to quantify that...

 4 also looks OK to me at a quick glance. It basically
 enables handling the die interrupt immediately, if we're blocked in a read
 or write. It won't be handled in the signal handler, but within the
 secure_read/write call anyway.

What are you thinking about the concern that it'll reduce the likelihood
of transferring the error message to the client? I tried to reduce that
by only allowing errors when write() blocks, but that's not an
infrequent event.

Thanks for looking.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-03 Thread Heikki Linnakangas


On 10/03/2014 05:26 PM, Andres Freund wrote:

On 2014-10-03 17:12:18 +0300, Heikki Linnakangas wrote:

On 09/28/2014 01:54 AM, Andres Freund wrote:

0003 Sinval/notify processing got simplified further. There really isn't
  any need for DisableNotifyInterrupt/DisableCatchupInterrupt
  anymore. Also begin_client_read/client_read_ended don't make much
  sense anymore. Instead introduce ProcessClientReadInterrupt (which
  wants a better name).
There's also a very WIP
0004 Allows secure_read/write be interrupted when ProcDiePending is
  set. All of that happens via the latch mechanism, nothing happens
  inside signal handlers. So I do think it's quite an improvement
  over what's been discussed in this thread.
  But it (and the other approaches) do noticeably increase the
  likelihood of clients not getting the error message if the client
  isn't actually dead. The likelihood of write() being blocked
  *temporarily* due to normal bandwidth constraints is quite high
  when you consider COPY FROM and similar. Right now we'll wait till
  we can put the error message into the socket afaics.

1-3 need some serious comment work, but I think the approach is
basically sound. I'm much, much less sure about allowing send() to be
interrupted.


Yeah, 1-3 seem sane.


I think 3 also needs a careful look. Have you looked through it? While
imo much less complex than before, there's some complex interactions in
the touched code. And we have terrible coverage of both catchup
interrupts and notify stuff...


I only looked at the .patch, I didn't apply it, so I didn't look at the 
context much. But I don't see any fundamental problem with it. I would 
like to have a closer look before it's committed, though.



There's also the concern that using a latch for client communication
increases the number of syscalls for the same work. We should at least
try to quantify that...


I'm not too concerned about that, since we only do extra syscalls when 
the socket isn't immediately available for reading/writing, i.e. when we 
have to sleep anyway.



4 also looks OK to me at a quick glance. It basically
enables handling the die interrupt immediately, if we're blocked in a read
or write. It won't be handled in the signal handler, but within the
secure_read/write call anyway.


What are you thinking about the concern that it'll reduce the likelihood
of transferring the error message to the client? I tried to reduce that
by only allowing errors when write() blocks, but that's not an
infrequent event.


I'm not too concerned about that either. I mean, it's probably true that 
it reduces the likelihood, but I don't particularly care myself. But if 
we care, we could use a timeout there, so that if we receive a SIGTERM 
while blocked on a send(), we wait for a few seconds to see if we can 
send whatever we were sending, before terminating the backend.



What should we do with this patch in the commitfest? Are you planning to 
clean up and commit these patches?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-02 Thread Kyotaro HORIGUCHI

  Sorry, I missed this message and only cought up when reading your CF
  status mail. I've attached three patches:
 
  Could let me know how to get the CF status mail?
 
 I think he meant this email I sent last weekend:
 
 http://www.postgresql.org/message-id/542672d2.3060...@vmware.com

I see, that's what I also received. Thank you.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-02 Thread Kyotaro HORIGUCHI

Hello,

  I propose the attached patch. It adds a new flag ImmediateDieOK, which is a
  weaker form of ImmediateInterruptOK that only allows handling a pending
  die-signal in the signal handler.
  
  Robert, others, do you see a problem with this?
 
 Per se I don't have a problem with it. There does exist the problem that
 the user doesn't get a error message in more cases though. On the other
 hand it's bad if any user can prevent the database from restarting.
 
  Over IM, Robert pointed out that it's not safe to jump out of a signal
  handler with siglongjmp, when we're inside library calls, like in a callback
  called by OpenSSL. But even with current master branch, that's exactly what
  we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means
  that any incoming signal will be handled directly in the signal handler,
  which can mean elog(ERROR). Should we be worried? OpenSSL might get confused
  if control never returns to the SSL_read() or SSL_write() function that
  called secure_raw_read().
 
 But this is imo prohibitive. Yes, we're doing it for a long while. But
 no, that's not ok. It actually prompoted me into prototyping the latch
 thing (in some other thread). I don't think existing practice justifies
 expanding it further.

I see, in that case, this approach seems basically
applicable. But if I understand correctly, this patch seems not
to return out of the openssl code even when latch was found to be
set in secure_raw_write/read.  I tried setting errno = ECONNRESET
and it went well but seems a bad deed.

secure_raw_write(Port *port, const void *ptr, size_t len)
{
  n = send(port-sock, ptr, len, 0);
  
  if (!port-noblock  n  0  (errno == EWOULDBLOCK || errno == EAGAIN))
  {
w = WaitLatchOrSocket(MyProc-procLatch, ...

if (w  WL_LATCH_SET)
{
ResetLatch(MyProc-procLatch);
/*
* Force a return, so interrupts can be processed when not
* (possibly) underneath a ssl library.
*/
errno = EINTR;
(return n;  // n is negative)


my_sock_write(BIO *h, const char *buf, int size)
{
  res = secure_raw_write(((Port *) h-ptr), buf, size);
  BIO_clear_retry_flags(h);
  if (res = 0)
  {
if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
{
  BIO_set_retry_write(h);
  

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-02 Thread Andres Freund

On 2014-10-02 17:47:39 +0900, Kyotaro HORIGUCHI wrote:
 Hello,
 
   I propose the attached patch. It adds a new flag ImmediateDieOK, which is 
   a
   weaker form of ImmediateInterruptOK that only allows handling a pending
   die-signal in the signal handler.
   
   Robert, others, do you see a problem with this?
  
  Per se I don't have a problem with it. There does exist the problem that
  the user doesn't get a error message in more cases though. On the other
  hand it's bad if any user can prevent the database from restarting.
  
   Over IM, Robert pointed out that it's not safe to jump out of a signal
   handler with siglongjmp, when we're inside library calls, like in a 
   callback
   called by OpenSSL. But even with current master branch, that's exactly 
   what
   we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which 
   means
   that any incoming signal will be handled directly in the signal handler,
   which can mean elog(ERROR). Should we be worried? OpenSSL might get 
   confused
   if control never returns to the SSL_read() or SSL_write() function that
   called secure_raw_read().
  
  But this is imo prohibitive. Yes, we're doing it for a long while. But
  no, that's not ok. It actually prompoted me into prototyping the latch
  thing (in some other thread). I don't think existing practice justifies
  expanding it further.
 
 I see, in that case, this approach seems basically
 applicable. But if I understand correctly, this patch seems not
 to return out of the openssl code even when latch was found to be
 set in secure_raw_write/read.

Correct. That's why I think it's the way forward. There's several
problems now where the inability to do real things while reading/writing
is a problem.

  I tried setting errno = ECONNRESET
 and it went well but seems a bad deed.

Where and why did you do that?

 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
   n = send(port-sock, ptr, len, 0);
   
   if (!port-noblock  n  0  (errno == EWOULDBLOCK || errno == EAGAIN))
   {
 w = WaitLatchOrSocket(MyProc-procLatch, ...
 
 if (w  WL_LATCH_SET)
 {
   ResetLatch(MyProc-procLatch);
 /*
 * Force a return, so interrupts can be processed when not
 * (possibly) underneath a ssl library.
 */
 errno = EINTR;
 (return n;  // n is negative)
 
 
 my_sock_write(BIO *h, const char *buf, int size)
 {
   res = secure_raw_write(((Port *) h-ptr), buf, size);
   BIO_clear_retry_flags(h);
   if (res = 0)
   {
 if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
 {
   BIO_set_retry_write(h);

Hm, this seems, besides one comment, the code from the last patch in my
series. Do you have a particular question about it?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-10-02 Thread Kyotaro HORIGUCHI

Hi,

   But this is imo prohibitive. Yes, we're doing it for a long while. But
   no, that's not ok. It actually prompoted me into prototyping the latch
   thing (in some other thread). I don't think existing practice justifies
   expanding it further.
  
  I see, in that case, this approach seems basically
  applicable. But if I understand correctly, this patch seems not
  to return out of the openssl code even when latch was found to be
  set in secure_raw_write/read.
 
 Correct. That's why I think it's the way forward. There's several
 problems now where the inability to do real things while reading/writing
 is a problem.
 
   I tried setting errno = ECONNRESET
  and it went well but seems a bad deed.
 
 Where and why did you do that?

The patch of this message.

http://www.postgresql.org/message-id/20140828.214704.93968088.horiguchi.kyot...@lab.ntt.co.jp

The reason for setting errno (instead of a variable for it) is to
trick openssl (or my_socck_write? I've forgot it..) into
recognizing as if the underneath send(2) have returned with any
uncontinueable error so it cannot be any of continueable errnos
(EINTR/EWOULDBLOCK/EAGAIN). Iy my faint memory, only avoiding
BIO_set_retry_write() in my_sock_write() dosn't work as expected
but it might be enough that my_sock_write returns -1 and doesn't
set BIO_set_retry_write().

The reason why ECONNNRESET is any of other errnos possible for
send(2)(*1) doesn't seem to fit the real situation, and the
blocked situation seems similar to resetted connection from the
view that it cannot continue to work due to external condition,
and it is used in be_tls_write() in a similary way.

Come to think of it, setting ECONNRESET is not so evil?


  secure_raw_write(Port *port, const void *ptr, size_t len)
  {
n = send(port-sock, ptr, len, 0);

if (!port-noblock  n  0  (errno == EWOULDBLOCK || errno == EAGAIN))
{
  w = WaitLatchOrSocket(MyProc-procLatch, ...
  
  if (w  WL_LATCH_SET)
  {
  ResetLatch(MyProc-procLatch);
  /*
  * Force a return, so interrupts can be processed when not
  * (possibly) underneath a ssl library.
  */
  errno = EINTR;
  (return n;  // n is negative)
  
  
  my_sock_write(BIO *h, const char *buf, int size)
  {
res = secure_raw_write(((Port *) h-ptr), buf, size);
BIO_clear_retry_flags(h);
if (res = 0)
{
  if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
  {
BIO_set_retry_write(h);
 
 Hm, this seems, besides one comment, the code from the last patch in my
 series. Do you have a particular question about it?

I didn't have a particluar qustion about it. This is cited only
in order to show the route to retrying.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-30 Thread Kyotaro HORIGUCHI

Thank you for reviewing. I'll look close to the patch tomorrow.

 I must say this scares the heck out of me. The current code goes
 through some trouble to not throw an error while in a recv()
 send(). For example, you removed the DoingCommandRead check from
 prepare_for_client_read(). There's an comment in postgres.c that says
 this:
 
  /*
   * (2) Allow asynchronous signals to be executed immediately
   * if they
   * come in while we are waiting for client input. (This must
   * be
   * conditional since we don't want, say, reads on behalf of
   * COPY FROM
   * STDIN doing the same thing.)
   */
  DoingCommandRead = true;

Hmm. Sorry. That's my fault that I skipped over the issues about
COPY FROM STDIN.

 With the patch, we do allow asynchronous signals to be processed while
 blocked in COPY FROM STDIN. Maybe that's OK, but I don't feel
 comfortable just changing it. (the comment is now wrong, of course)

I don't see actual problem but I agree that the behavior should
not be chenged.

 This patch also enables processing query cancel signals while blocked,
 not just SIGTERM. That's not good; we might be in the middle of
 sending a message, and we cannot just error out of that or we might
 violate the fe/be protocol. That's OK with a SIGTERM as you're
 terminating the connection anyway, and we have the PqCommBusy
 safeguard in place that prevents us from sending broken messages to
 the client, but that's not good enough if we wanted to keep the
 backend alive, as we won't be able to send anything to the client
 anymore.

Ok, since what I want is escaping from blocked send() only by
SIGTERM, it needs another mechanism from current
prepare_for_client_read().

 BTW, we've been talking about blocking in send(), but this patch also
 let's a recv() in e.g. COPY FROM STDIN to be interrupted. That's
 probably a good thing; surely you have exactly the same issues with
 that as with send(). But I didn't realize we had a problem with that
 too.

I see. (But it is mere a side-effect of my carelessness, as you know:)

 In summary, this patch is not ready as it is, but I think we can fix
 it. The key question is: is it safe to handle SIGTERM in the signal
 handler, calling the exit-handlers and exiting the backend, when
 blocked in a recv() or send()? It's a change in the pqcomm.c API; most
 pqcomm.c functions have not thrown errors or processed interrupts
 before. But looking at the callers, I think it's safe, and there isn't
 actually any comments explicitly saying that pqcomm.c will never throw
 errors.
 
 I propose the attached patch. It adds a new flag ImmediateDieOK, which
 is a weaker form of ImmediateInterruptOK that only allows handling a
 pending die-signal in the signal handler.
 
 Robert, others, do you see a problem with this?

The patch seems excluding all problems menthioned in the message,
I have no objection to it.

 Over IM, Robert pointed out that it's not safe to jump out of a signal
 handler with siglongjmp, when we're inside library calls, like in a
 callback called by OpenSSL. But even with current master branch,
 that's exactly what we do. In secure_raw_read(), we set
 ImmediateInterruptOK = true, which means that any incoming signal will
 be handled directly in the signal handler, which can mean
 elog(ERROR). Should we be worried? OpenSSL might get confused if
 control never returns to the SSL_read() or SSL_write() function that
 called secure_raw_read().

IMHO, it will soon die even if OpenSSL is confused. It seems a
bit brute that sudden cutoff occurs even when the socket is *not*
blocked, but the backend will soon die and frontend will
immediately get ECONNRESET (..hmm it is not seen in manpages of
recv/read(2)) and should safely exit from OpenSSL.

I cannot run this patch right now, but it seems to be no problem.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-30 Thread Kyotaro HORIGUCHI

Wow, thank you for the patch.

   0001: Allows WaitLatchOrSocket(WL_WRITABLE) without WL_READABLE. I've
 tested the poll() and select() implementations on linux and
 blindly patched up windows.
   0002: Put the socket the backend uses to communicate with the client
 into nonblocking mode as soon as latches are ready and use latches
 to wait. This probably doesn't work correctly without 0003, but
 seems easier to review separately.
   0003: Don't do sinval catchup and notify processing in signal
 handlers. It's quite cool that it worked that well so far, but it
 requires some complicated code and is rather fragile. 0002 allows
 to move that out of signal handlers and just use a latch
 there. This seems remarkably simpler:
  4 files changed, 69 insertions(+), 229 deletions(-)
   
   These aren't ready for commit, especially not 0003, but I think they are
   quite a good foundation for getting rid of the blocking in send(). I
   haven't added any interrupt processing after interrupted writes, but
   marked the relevant places with XXXs.
   
   With regard to 0002, I dislike the current need to do interrupt
   processing both in be-secure.c and be-secure-openssl.c. I guess we could
   solve that by returning something like EINTR from the ssl routines when
   they need further reads/writes and do all the processing in one place in
   be-secure.c.
   
   There's also some cleanup in 0002/0003 needed:
   prepare_for_client_read()/client_read_ended() aren't needed in that form
   anymore and should probably rather be something like
   CHECK_FOR_READ_INTERRUPT() or similar.  Similarly the
   EnableCatchupInterrupt()/DisableCatchupInterrupt() in autovacuum.c is
   pretty ugly.
   
   Btw, be-secure.c is really not a good name anymore...
   
   What do you think?
  
  I've invested some more time in this:
  0002 now makes sense on its own and doesn't change anything around the
   interrupt handling. Oh, and it compiles without 0003.
  0003 Sinval/notify processing got simplified further. There really isn't
   any need for DisableNotifyInterrupt/DisableCatchupInterrupt
   anymore. Also begin_client_read/client_read_ended don't make much
   sense anymore. Instead introduce ProcessClientReadInterrupt (which
   wants a better name).
  There's also a very WIP
  0004 Allows secure_read/write be interrupted when ProcDiePending is
   set. All of that happens via the latch mechanism, nothing happens
   inside signal handlers. So I do think it's quite an improvement
   over what's been discussed in this thread.
   But it (and the other approaches) do noticeably increase the
   likelihood of clients not getting the error message if the client
   isn't actually dead. The likelihood of write() being blocked
   *temporarily* due to normal bandwidth constraints is quite high
   when you consider COPY FROM and similar. Right now we'll wait till
   we can put the error message into the socket afaics.
  
  1-3 need some serious comment work, but I think the approach is
  basically sound. I'm much, much less sure about allowing send() to be
  interrupted.
 
 Kyatoro, could you check whether you can achieve what you want using
 0004?
 
 It's imo pretty clear that a fair amount of base work needs to be done
 and there's been a fair amount of progress made this fest. I think this
 can now be marked returned with feedback.

Myself is satisfied by Heikki's solution, and it seems ready for
commit. But I agree with the temporarily blocked state is seen
often and it breaks even non-blocked socket. If we want to/should
avoid breaking *temporarily or not* blocked socket even for
SIGTERM, this mechanism should be used.

Which way should we take?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-30 Thread Kyotaro HORIGUCHI

By the way,

 Sorry, I missed this message and only cought up when reading your CF
 status mail. I've attached three patches:

Could let me know how to get the CF status mail?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-30 Thread Andres Freund

On 2014-09-26 21:02:16 +0300, Heikki Linnakangas wrote:
 I propose the attached patch. It adds a new flag ImmediateDieOK, which is a
 weaker form of ImmediateInterruptOK that only allows handling a pending
 die-signal in the signal handler.
 
 Robert, others, do you see a problem with this?

Per se I don't have a problem with it. There does exist the problem that
the user doesn't get a error message in more cases though. On the other
hand it's bad if any user can prevent the database from restarting.

 Over IM, Robert pointed out that it's not safe to jump out of a signal
 handler with siglongjmp, when we're inside library calls, like in a callback
 called by OpenSSL. But even with current master branch, that's exactly what
 we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means
 that any incoming signal will be handled directly in the signal handler,
 which can mean elog(ERROR). Should we be worried? OpenSSL might get confused
 if control never returns to the SSL_read() or SSL_write() function that
 called secure_raw_read().

But this is imo prohibitive. Yes, we're doing it for a long while. But
no, that's not ok. It actually prompoted me into prototyping the latch
thing (in some other thread). I don't think existing practice justifies
expanding it further.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-29 Thread Andres Freund

On 2014-09-28 00:54:21 +0200, Andres Freund wrote:
 On 2014-09-27 21:12:43 +0200, Andres Freund wrote:
  On 2014-09-03 15:09:54 +0300, Heikki Linnakangas wrote:
  Sorry, I missed this message and only cought up when reading your CF
  status mail. I've attached three patches:
  
  0001: Allows WaitLatchOrSocket(WL_WRITABLE) without WL_READABLE. I've
tested the poll() and select() implementations on linux and
blindly patched up windows.
  0002: Put the socket the backend uses to communicate with the client
into nonblocking mode as soon as latches are ready and use latches
to wait. This probably doesn't work correctly without 0003, but
seems easier to review separately.
  0003: Don't do sinval catchup and notify processing in signal
handlers. It's quite cool that it worked that well so far, but it
requires some complicated code and is rather fragile. 0002 allows
to move that out of signal handlers and just use a latch
there. This seems remarkably simpler:
 4 files changed, 69 insertions(+), 229 deletions(-)
  
  These aren't ready for commit, especially not 0003, but I think they are
  quite a good foundation for getting rid of the blocking in send(). I
  haven't added any interrupt processing after interrupted writes, but
  marked the relevant places with XXXs.
  
  With regard to 0002, I dislike the current need to do interrupt
  processing both in be-secure.c and be-secure-openssl.c. I guess we could
  solve that by returning something like EINTR from the ssl routines when
  they need further reads/writes and do all the processing in one place in
  be-secure.c.
  
  There's also some cleanup in 0002/0003 needed:
  prepare_for_client_read()/client_read_ended() aren't needed in that form
  anymore and should probably rather be something like
  CHECK_FOR_READ_INTERRUPT() or similar.  Similarly the
  EnableCatchupInterrupt()/DisableCatchupInterrupt() in autovacuum.c is
  pretty ugly.
  
  Btw, be-secure.c is really not a good name anymore...
  
  What do you think?
 
 I've invested some more time in this:
 0002 now makes sense on its own and doesn't change anything around the
  interrupt handling. Oh, and it compiles without 0003.
 0003 Sinval/notify processing got simplified further. There really isn't
  any need for DisableNotifyInterrupt/DisableCatchupInterrupt
  anymore. Also begin_client_read/client_read_ended don't make much
  sense anymore. Instead introduce ProcessClientReadInterrupt (which
  wants a better name).
 There's also a very WIP
 0004 Allows secure_read/write be interrupted when ProcDiePending is
  set. All of that happens via the latch mechanism, nothing happens
  inside signal handlers. So I do think it's quite an improvement
  over what's been discussed in this thread.
  But it (and the other approaches) do noticeably increase the
  likelihood of clients not getting the error message if the client
  isn't actually dead. The likelihood of write() being blocked
  *temporarily* due to normal bandwidth constraints is quite high
  when you consider COPY FROM and similar. Right now we'll wait till
  we can put the error message into the socket afaics.
 
 1-3 need some serious comment work, but I think the approach is
 basically sound. I'm much, much less sure about allowing send() to be
 interrupted.

Kyatoro, could you check whether you can achieve what you want using
0004?

It's imo pretty clear that a fair amount of base work needs to be done
and there's been a fair amount of progress made this fest. I think this
can now be marked returned with feedback.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-27 Thread Andres Freund

On 2014-09-03 15:09:54 +0300, Heikki Linnakangas wrote:
 On 09/03/2014 12:23 AM, Andres Freund wrote:
 On 2014-09-02 17:21:03 -0400, Tom Lane wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
 I was going to suggest using WaitLatchOrSocket instead of sleeping in 1
 second increment, but I see that WaitLatchOrSocket() doesn't currently
 support waiting for a socket to become writeable, without also waiting
 for it to become readable. I wonder how difficult it would be to lift
 that restriction.
 
 My recollection is that there was a reason for that, but I don't recall
 details any more.
 
 http://git.postgresql.org/pg/commitdiff/e42a21b9e6c9b9e6346a34b62628d48ff2fc6ddf
 
 In my prototype I've changed the API that errors set both
 READABLE/WRITABLE. Seems to work
 
 Andres, would you mind posting the WIP patch you have? That could be a
 better foundation for this patch.

Sorry, I missed this message and only cought up when reading your CF
status mail. I've attached three patches:

0001: Allows WaitLatchOrSocket(WL_WRITABLE) without WL_READABLE. I've
  tested the poll() and select() implementations on linux and
  blindly patched up windows.
0002: Put the socket the backend uses to communicate with the client
  into nonblocking mode as soon as latches are ready and use latches
  to wait. This probably doesn't work correctly without 0003, but
  seems easier to review separately.
0003: Don't do sinval catchup and notify processing in signal
  handlers. It's quite cool that it worked that well so far, but it
  requires some complicated code and is rather fragile. 0002 allows
  to move that out of signal handlers and just use a latch
  there. This seems remarkably simpler:
   4 files changed, 69 insertions(+), 229 deletions(-)

These aren't ready for commit, especially not 0003, but I think they are
quite a good foundation for getting rid of the blocking in send(). I
haven't added any interrupt processing after interrupted writes, but
marked the relevant places with XXXs.

With regard to 0002, I dislike the current need to do interrupt
processing both in be-secure.c and be-secure-openssl.c. I guess we could
solve that by returning something like EINTR from the ssl routines when
they need further reads/writes and do all the processing in one place in
be-secure.c.

There's also some cleanup in 0002/0003 needed:
prepare_for_client_read()/client_read_ended() aren't needed in that form
anymore and should probably rather be something like
CHECK_FOR_READ_INTERRUPT() or similar.  Similarly the
EnableCatchupInterrupt()/DisableCatchupInterrupt() in autovacuum.c is
pretty ugly.

Btw, be-secure.c is really not a good name anymore...

What do you think?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-26 Thread Heikki Linnakangas


On 09/09/2014 01:31 PM, Kyotaro HORIGUCHI wrote:

Hi, I added and edited some comments.


Sorry, It tha patch contains a silly bug. Please find the
attatched one.


I must say this scares the heck out of me. The current code goes through 
some trouble to not throw an error while in a recv() send(). For 
example, you removed the DoingCommandRead check from 
prepare_for_client_read(). There's an comment in postgres.c that says this:



/*
 * (2) Allow asynchronous signals to be executed immediately if 
they
 * come in while we are waiting for client input. (This must be
 * conditional since we don't want, say, reads on behalf of 
COPY FROM
 * STDIN doing the same thing.)
 */
DoingCommandRead = true;


With the patch, we do allow asynchronous signals to be processed while 
blocked in COPY FROM STDIN. Maybe that's OK, but I don't feel 
comfortable just changing it. (the comment is now wrong, of course)


This patch also enables processing query cancel signals while blocked, 
not just SIGTERM. That's not good; we might be in the middle of sending 
a message, and we cannot just error out of that or we might violate the 
fe/be protocol. That's OK with a SIGTERM as you're terminating the 
connection anyway, and we have the PqCommBusy safeguard in place that 
prevents us from sending broken messages to the client, but that's not 
good enough if we wanted to keep the backend alive, as we won't be able 
to send anything to the client anymore.


BTW, we've been talking about blocking in send(), but this patch also 
let's a recv() in e.g. COPY FROM STDIN to be interrupted. That's 
probably a good thing; surely you have exactly the same issues with that 
as with send(). But I didn't realize we had a problem with that too.



In summary, this patch is not ready as it is, but I think we can fix it. 
The key question is: is it safe to handle SIGTERM in the signal handler, 
calling the exit-handlers and exiting the backend, when blocked in a 
recv() or send()? It's a change in the pqcomm.c API; most pqcomm.c 
functions have not thrown errors or processed interrupts before. But 
looking at the callers, I think it's safe, and there isn't actually any 
comments explicitly saying that pqcomm.c will never throw errors.


I propose the attached patch. It adds a new flag ImmediateDieOK, which 
is a weaker form of ImmediateInterruptOK that only allows handling a 
pending die-signal in the signal handler.


Robert, others, do you see a problem with this?

Over IM, Robert pointed out that it's not safe to jump out of a signal 
handler with siglongjmp, when we're inside library calls, like in a 
callback called by OpenSSL. But even with current master branch, that's 
exactly what we do. In secure_raw_read(), we set ImmediateInterruptOK = 
true, which means that any incoming signal will be handled directly in 
the signal handler, which can mean elog(ERROR). Should we be worried? 
OpenSSL might get confused if control never returns to the SSL_read() or 
SSL_write() function that called secure_raw_read().


- Heikki

diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 41ec1ad..049e5b1 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -178,5 +178,13 @@ secure_write(Port *port, void *ptr, size_t len)
 ssize_t
 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
-	return send(port-sock, ptr, len, 0);
+	ssize_t result;
+
+	prepare_for_client_write();
+
+	result = send(port-sock, ptr, len, 0);
+
+	client_write_ended();
+
+	return result;
 }
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 61f17bf..138060b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -497,6 +497,17 @@ ReadCommand(StringInfo inBuf)
  * non-reentrant libc functions.  This restriction makes it safe for us
  * to allow interrupt service routines to execute nontrivial code while
  * we are waiting for input.
+ *
+ * When waiting in the main loop, we can process any interrupt immediately
+ * in the signal handler. In any other read from the client, like in a COPY
+ * FROM STDIN, we can't safely process a query cancel signal, because we might
+ * be in the middle of sending a message to the client, and jumping out would
+ * violate the protocol. Or rather, pqcomm.c would detect it and refuse to
+ * send any more messages to the client. But handling a SIGTERM is OK, because
+ * we're terminating the backend and don't need to send any more messages
+ * anyway. That means that we might not be able to send an error message to
+ * the client, but that seems better than waiting indefinitely, in case the
+ * client is not responding.
  */
 void
 prepare_for_client_read(void)
@@ -513,6 +524,15 @@ prepare_for_client_read(void)
 		/* And don't forget to detect one that already arrived */
 		CHECK_FOR_INTERRUPTS();
 	}
+	else
+	{
+		/* Allow die

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-17 Thread Kyotaro HORIGUCHI

Sorry for the mistake...

At Wed, 10 Sep 2014 18:53:03 +0300, Heikki Linnakangas 
hlinnakan...@vmware.com wrote in 541073df.70...@vmware.com
 Wrong thread...
 
 On 09/10/2014 03:04 AM, Kyotaro HORIGUCHI wrote:
  Hmm. Sorry, I misunderstood the specification.
 
  You approach that coloring tokens seems right, but you have
  broken the parse logic by adding your code.
 
  Other than the mistakes others pointed, I found that
 
  - non-SQL-ident like tokens are ignored by their token style,
 quoted or not, so the following line works.
 
  | local All aLL trust

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-10 Thread Heikki Linnakangas


Wrong thread...

On 09/10/2014 03:04 AM, Kyotaro HORIGUCHI wrote:

Hmm. Sorry, I misunderstood the specification.


You approach that coloring tokens seems right, but you have
broken the parse logic by adding your code.

Other than the mistakes others pointed, I found that

- non-SQL-ident like tokens are ignored by their token style,
   quoted or not, so the following line works.

| local All aLL trust

I suppose this is not what you intended. This is because you have
igonred the attribute of a token when comparing it as
non-SQL-ident tokens.


- '+' at the head of the sequence '+' is treated as the first
   character of the *quoted* string. e.g. +hoge is tokenized as
   +hoge:special_quoted.


I found this is what intended. This should be documented as
comments.

|2) users and user-groups only requires special handling and behavior as follows
|Normal user :
|  A. unquoted ( USER ) will be treated as user ( downcase ).
|  B. quoted  ( USeR )  will be treated as USeR (case-sensitive).
|  C. quoted ( +USER ) will be treated as normal user +USER (i.e. will 
not be considered as user-group) and case-sensitive as string is quoted.

This seems confising with the B below. This seems should be
rearranged.

|   User Group :
|  A. unquoted ( +USERGROUP ) will be treated as +usergruop ( downcase ).
|  B. plus quoted ( +UserGROUP  ) will be treated as +UserGROUP 
(case-sensitive).




This is why you simply continued processing for '+' without
discarding and skipping the '+', and not setting in_quote so the
following parser code works as it is not intended. You should
understand what the original code does and insert or modify
logics not braeking the assumptions.


regards,




--
- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-09 Thread Kyotaro HORIGUCHI

Hi, I added and edited some comments.

 Sorry, It tha patch contains a silly bug. Please find the
 attatched one.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From eb91a7c91e1fd3b24bf5bff0eb885f1c3d274637 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Fri, 5 Sep 2014 17:21:48 +0900
Subject: [PATCH] Simplly cutting off the socket if signalled during sending to client.

---
 src/backend/libpq/be-secure.c |   21 ++--
 src/backend/libpq/pqcomm.c|   13 +++
 src/backend/tcop/postgres.c   |   71 ++---
 src/include/libpq/libpq.h |1 +
 4 files changed, 70 insertions(+), 36 deletions(-)

diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 41ec1ad..3006697 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -145,11 +145,11 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 
 	n = recv(port-sock, ptr, len, 0);
 
-	client_read_ended();
+	client_comm_ended();
 
 	return n;
 }
@@ -178,5 +178,20 @@ secure_write(Port *port, void *ptr, size_t len)
 ssize_t
 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
-	return send(port-sock, ptr, len, 0);
+	ssize_t		n;
+
+	/*
+	 * If we get interrupted during send under execution without blocking,
+	 * processing interrupt immediately actually throws away the chance to
+	 * complete sending the bytes handed, but the chance which we could send
+	 * one more tuple or maybe the final bytes has less not significance than
+	 * the risk that we might can't bail out forever due to blocking send.
+	 */
+	prepare_for_client_comm();
+
+	n = send(port-sock, ptr, len, 0);
+
+	client_comm_ended();
+
+	return n;
 }
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 605d891..9b08529 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -1343,6 +1343,19 @@ pq_is_send_pending(void)
 }
 
 /* 
+ *		pq_is_busy	- is there any I/O command running?
+ *
+ *		This function is intended for use within signal handlers to check if
+ * 		any pqcomm I/O operation is under execution.
+ * 
+ */
+bool
+pq_is_busy(void)
+{
+	return PqCommBusy;
+}
+
+/* 
  * Message-level I/O routines begin here.
  *
  * These routines understand about the old-style COPY OUT protocol.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7b5480f..b29b200 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -303,16 +303,16 @@ InteractiveBackend(StringInfo inBuf)
  *
  * Even though we are not reading from a client process, we still want to
  * respond to signals, particularly SIGTERM/SIGQUIT.  Hence we must use
- * prepare_for_client_read and client_read_ended.
+ * prepare_for_client_comm and client_comm_ended.
  */
 static int
 interactive_getc(void)
 {
 	int			c;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 	c = getc(stdin);
-	client_read_ended();
+	client_comm_ended();
 	return c;
 }
 
@@ -487,53 +487,47 @@ ReadCommand(StringInfo inBuf)
 }
 
 /*
- * prepare_for_client_read -- set up to possibly block on client input
+ * prepare_for_client_comm -- set up to possibly block on client communication
  *
- * This must be called immediately before any low-level read from the
- * client connection.  It is necessary to do it at a sufficiently low level
- * that there won't be any other operations except the read kernel call
- * itself between this call and the subsequent client_read_ended() call.
+ * This must be called immediately before any low-level read from or write to
+ * the client connection.  It is necessary to do it at a sufficiently low
+ * level that there won't be any other operations except the read/write kernel
+ * call itself between this call and the subsequent client_comm_ended() call.
  * In particular there mustn't be use of malloc() or other potentially
- * non-reentrant libc functions.  This restriction makes it safe for us
- * to allow interrupt service routines to execute nontrivial code while
- * we are waiting for input.
+ * non-reentrant libc functions.  This restriction makes it safe for us to
+ * allow interrupt service routines to execute nontrivial code while we are
+ * waiting for input or blocking of output.
  */
 void
-prepare_for_client_read(void)
+prepare_for_client_comm(void)
 {
-	if (DoingCommandRead)
-	{
-		/* Enable immediate processing of asynchronous signals */
-		EnableNotifyInterrupt();
-		EnableCatchupInterrupt();
+	/* Enable immediate processing of asynchronous signals */
+	EnableNotifyInterrupt();
+	EnableCatchupInterrupt();
 
-		/* Allow cancel/die interrupts to be processed while waiting */
-		ImmediateInterruptOK = true;
+	/* Allow cancel/die interrupts to be processed while waiting */
+	ImmediateInterruptOK = true;
 
-		/*

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-09 Thread Kyotaro HORIGUCHI

Hmm. Sorry, I misunderstood the specification.

 You approach that coloring tokens seems right, but you have
 broken the parse logic by adding your code.
 
 Other than the mistakes others pointed, I found that
 
 - non-SQL-ident like tokens are ignored by their token style,
   quoted or not, so the following line works.
 
 | local All aLL trust
 
 I suppose this is not what you intended. This is because you have
 igonred the attribute of a token when comparing it as
 non-SQL-ident tokens.
 
 
 - '+' at the head of the sequence '+' is treated as the first
   character of the *quoted* string. e.g. +hoge is tokenized as
   +hoge:special_quoted.

I found this is what intended. This should be documented as
comments.

|2) users and user-groups only requires special handling and behavior as follows
|Normal user :
|  A. unquoted ( USER ) will be treated as user ( downcase ).
|  B. quoted  ( USeR )  will be treated as USeR (case-sensitive).
|  C. quoted ( +USER ) will be treated as normal user +USER (i.e. will 
not be considered as user-group) and case-sensitive as string is quoted.

This seems confising with the B below. This seems should be
rearranged.

|   User Group :
|  A. unquoted ( +USERGROUP ) will be treated as +usergruop ( downcase ).
|  B. plus quoted ( +UserGROUP  ) will be treated as +UserGROUP 
(case-sensitive).



 This is why you simply continued processing for '+' without
 discarding and skipping the '+', and not setting in_quote so the
 following parser code works as it is not intended. You should
 understand what the original code does and insert or modify
 logics not braeking the assumptions.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-05 Thread Kyotaro HORIGUCHI

Hello,

  - This patch introduced redundant socket emulation for win32
 backend but win32 bare socket for Port is already nonblocking
 as described so it donsn't seem to be a serious problem on
 performance. Addition to it, since I don't know the reason why
 win32/socket.c provides the blocking-mode socket emulation, I
 decided to preserve win32/socket.c to have blocking socket
 emulation. Possibly it can be removed.
 
 On Windows, the backend has an emulation layer for POSIX signals,
 which uses threads and Windows events. The reason win32/socket.c
 always uses non-blocking mode internally is that it needs to wait for
 the socket to become readable/writeable, and for the signal-emulation
 event, at the same time. So no, we can't remove it.

I see, thank you.

 The approach taken in the first patch seems sensible. I changed it to
 not use FD_SET, though. A custom array seems better, that way we don't
 need the pgwin32_nonblockset_init() call, we can just use initialize
 the variable. It's a little bit more code, but it's well-contained in
 win32/socket.c. Please take a look, to double-check that I didn't
 screw up.

Thank you. I felt a bit qualm to abusing fd_set. A bit more code
is not a problem.

I had close look on your patch.

Both 'nonblocking' and 'noblock' are appears in function names,
pgwin32_set_socket_block/noblock/is_nonblocking(). I prefer
nonblocking/blocking pair but I'm satisfied they are in uniform
style anyway. (Though I also didn't so ;p)

pgwin32_set_socket_block() leaves garbage in
nonblocking_sockets[] but it's no problem practically. You also
removed blocking'ize(?) code but I agree that it is correct
because fds of nonclosed socket won't be reused anyway.

pg_set_block/noblock() made me laugh. Yes you're correct. Sorry
for the bronken (but workable) code.

After all, the patch looks pretty good.
I'll continue to fit the another patch onto this.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-05 Thread Kyotaro HORIGUCHI

Hello, attached is the new-and-far-simple version of this
patch. It no longer depends on win32 nonblocking patch since the
socket is in blocking mode again.

 On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote:
  - Preventing protocol violation.
 
 To prevent protocol violation, secure_write sets
 ClientConnectionLost when SIGTERM detected, then
 internal_flush() and ProcessInterrupts() follow the
 instruction.
 
 Oh, hang on. Now that I look at pqcomm.c more closely, it already has
 a mechanism to avoid writing a message in the middle of another
 message. See pq_putmessage and PqCommBusy. Can we rely on that?

Hmm, it gracefully returns up to ExecProcNode() and PqCommBusy is
turned off on the way at pq_putmessage() under current
implement. So PqCommBusy is already false before it runs into
ProcessInterrupts().

Allowing ImmediateInterruptOK on signalled during send(), setting
whereToSendOutput to DestNone if PqCommBusy is true will do. We
can also distinguish read and write by looking
DoingCommandRead. The ImmediateInterruptOK section can be defined
enclosing by prepare_for_client_read/client_read_end.

  - Single pg_terminate_backend surely kills the backend.
 
 secure_raw_write() uses non-blocking socket and a loop of
 select() with timeout to surely detects received
 signal(SIGTERM).
 
 I was going to suggest using WaitLatchOrSocket instead of sleeping in
 1 second increment, but I see that WaitLatchOrSocket() doesn't
 currently support waiting for a socket to become writeable, without
 also waiting for it to become readable. I wonder how difficult it
 would be to lift that restriction.

It seems quite difficult hearing the following discussion.

 I also wonder if it would be simpler to keep the socket in blocking
 mode after all, and just close() in the signal handler if PqCommBusy
 == true. If the signal arrives while sleeping in send(), the effect
 would be the same as with your patch. If the signal arrives while
 sending, but not sleeping, you would not get the chance to send the
 already-buffered data to the client. But maybe that's OK, whether or
 not you're blocked is not very deterministic anyway.

Hmm. We're back round to the my first patch, with immediately
close the socket, and became irrelevant to win32 layer
patch. Anyway, it sounds reasonable.

Attached patch is a quick hack patch, but it seems working as
expected at a glance.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 7fcb6ef2e66231605e49bd51cd09d275b40cfd57 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Fri, 5 Sep 2014 17:21:48 +0900
Subject: [PATCH] Simplly cutting off the socket if signalled during sending to client.

---
 src/backend/libpq/be-secure.c |   14 +++---
 src/backend/libpq/pqcomm.c|6 ++
 src/backend/tcop/postgres.c   |   40 +---
 src/include/libpq/libpq.h |1 +
 4 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 41ec1ad..329812b 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -145,11 +145,11 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 
 	n = recv(port-sock, ptr, len, 0);
 
-	client_read_ended();
+	client_comm_ended();
 
 	return n;
 }
@@ -178,5 +178,13 @@ secure_write(Port *port, void *ptr, size_t len)
 ssize_t
 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
-	return send(port-sock, ptr, len, 0);
+	ssize_t		n;
+
+	prepare_for_client_comm();
+
+	n = send(port-sock, ptr, len, 0);
+
+	client_comm_ended();
+
+	return n;
 }
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 605d891..8f84f67 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -1342,6 +1342,12 @@ pq_is_send_pending(void)
 	return (PqSendStart  PqSendPointer);
 }
 
+bool
+pq_is_busy(void)
+{
+	return PqCommBusy;
+}
+
 /* 
  * Message-level I/O routines begin here.
  *
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7b5480f..7a4c483 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -303,16 +303,16 @@ InteractiveBackend(StringInfo inBuf)
  *
  * Even though we are not reading from a client process, we still want to
  * respond to signals, particularly SIGTERM/SIGQUIT.  Hence we must use
- * prepare_for_client_read and client_read_ended.
+ * prepare_for_client_comm and client_comm_ended.
  */
 static int
 interactive_getc(void)
 {
 	int			c;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 	c = getc(stdin);
-	client_read_ended();
+	client_comm_ended();
 	return c;
 }
 
@@ -487,7 +487,7 @@ ReadCommand(StringInfo inBuf)
 }
 
 /*
- * prepare_for_client_read -- set up to possibly block on client input
+ * prepare_for_client_comm -- set up to possibly block

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-05 Thread Kyotaro HORIGUCHI

Sorry, It tha patch contains a silly bug. Please find the
attatched one.

 Hello, attached is the new-and-far-simple version of this
 patch. It no longer depends on win32 nonblocking patch since the
 socket is in blocking mode again.
 
  On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote:
   - Preventing protocol violation.
  
  To prevent protocol violation, secure_write sets
  ClientConnectionLost when SIGTERM detected, then
  internal_flush() and ProcessInterrupts() follow the
  instruction.
  
  Oh, hang on. Now that I look at pqcomm.c more closely, it already has
  a mechanism to avoid writing a message in the middle of another
  message. See pq_putmessage and PqCommBusy. Can we rely on that?
 
 Hmm, it gracefully returns up to ExecProcNode() and PqCommBusy is
 turned off on the way at pq_putmessage() under current
 implement. So PqCommBusy is already false before it runs into
 ProcessInterrupts().
 
 Allowing ImmediateInterruptOK on signalled during send(), setting
 whereToSendOutput to DestNone if PqCommBusy is true will do. We
 can also distinguish read and write by looking
 DoingCommandRead. The ImmediateInterruptOK section can be defined
 enclosing by prepare_for_client_read/client_read_end.
 
   - Single pg_terminate_backend surely kills the backend.
  
  secure_raw_write() uses non-blocking socket and a loop of
  select() with timeout to surely detects received
  signal(SIGTERM).
  
  I was going to suggest using WaitLatchOrSocket instead of sleeping in
  1 second increment, but I see that WaitLatchOrSocket() doesn't
  currently support waiting for a socket to become writeable, without
  also waiting for it to become readable. I wonder how difficult it
  would be to lift that restriction.
 
 It seems quite difficult hearing the following discussion.
 
  I also wonder if it would be simpler to keep the socket in blocking
  mode after all, and just close() in the signal handler if PqCommBusy
  == true. If the signal arrives while sleeping in send(), the effect
  would be the same as with your patch. If the signal arrives while
  sending, but not sleeping, you would not get the chance to send the
  already-buffered data to the client. But maybe that's OK, whether or
  not you're blocked is not very deterministic anyway.
 
 Hmm. We're back round to the my first patch, with immediately
 close the socket, and became irrelevant to win32 layer
 patch. Anyway, it sounds reasonable.
 
 Attached patch is a quick hack patch, but it seems working as
 expected at a glance.
From 11da4bc3c214490671d27379910a667f06cc35af Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Fri, 5 Sep 2014 17:21:48 +0900
Subject: [PATCH] Simplly cutting off the socket if signalled during sending to client.

---
 src/backend/libpq/be-secure.c |   14 --
 src/backend/libpq/pqcomm.c|6 
 src/backend/tcop/postgres.c   |   53 -
 src/include/libpq/libpq.h |1 +
 4 files changed, 44 insertions(+), 30 deletions(-)

diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 41ec1ad..329812b 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -145,11 +145,11 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 
 	n = recv(port-sock, ptr, len, 0);
 
-	client_read_ended();
+	client_comm_ended();
 
 	return n;
 }
@@ -178,5 +178,13 @@ secure_write(Port *port, void *ptr, size_t len)
 ssize_t
 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
-	return send(port-sock, ptr, len, 0);
+	ssize_t		n;
+
+	prepare_for_client_comm();
+
+	n = send(port-sock, ptr, len, 0);
+
+	client_comm_ended();
+
+	return n;
 }
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 605d891..8f84f67 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -1342,6 +1342,12 @@ pq_is_send_pending(void)
 	return (PqSendStart  PqSendPointer);
 }
 
+bool
+pq_is_busy(void)
+{
+	return PqCommBusy;
+}
+
 /* 
  * Message-level I/O routines begin here.
  *
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7b5480f..15627c3 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -303,16 +303,16 @@ InteractiveBackend(StringInfo inBuf)
  *
  * Even though we are not reading from a client process, we still want to
  * respond to signals, particularly SIGTERM/SIGQUIT.  Hence we must use
- * prepare_for_client_read and client_read_ended.
+ * prepare_for_client_comm and client_comm_ended.
  */
 static int
 interactive_getc(void)
 {
 	int			c;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 	c = getc(stdin);
-	client_read_ended();
+	client_comm_ended();
 	return c;
 }
 
@@ -487,7 +487,7 @@ ReadCommand(StringInfo inBuf)
 }
 
 /*
- * prepare_for_client_read -- set up to possibly block on client

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-04 Thread Robert Haas

On Tue, Sep 2, 2014 at 3:01 PM, Andres Freund and...@2ndquadrant.com wrote:
 I'm slightly worried about the added overhead due to the latch code. In
 my implementation I only use latches after a nonblocking read, but
 still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if
 that can be made problematic.

I think that's not the word you're looking for.  Or if it is, then -
it's already problematic.  At some point I hacked up a very crude
prototype that made LWLocks use latches to sleep instead of
semaphores.  It was slow.

AIUI, the only reason why we need the self-pipe thing is because on
some platforms signals don't interrupt system calls.  But my
impression was that those platforms were somewhat obscure.  Could we
have a separate latch implementation for platforms where we know that
system calls will get interrupted by signals?  Alternatively, should
we consider reimplementing latches using semaphores?  I assume having
the signal handler up the semaphore would allow the attempt to down
the semaphore to succeed on return from the handler, so it would
accomplish the same thing as the self-pipe trick.

Basically, it doesn't feel like a good thing that we've got two sets
of primitives for making a backend wait that (1) don't really know
about each other and (2) use different operating system primitives.
Presumably one of the two systems is better; let's figure out which
one it is, use that one all the time, and get rid of the other one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-04 Thread Heikki Linnakangas


On 09/04/2014 03:49 PM, Robert Haas wrote:

On Tue, Sep 2, 2014 at 3:01 PM, Andres Freund and...@2ndquadrant.com wrote:

I'm slightly worried about the added overhead due to the latch code. In
my implementation I only use latches after a nonblocking read, but
still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if
that can be made problematic.


I think that's not the word you're looking for.  Or if it is, then -
it's already problematic.  At some point I hacked up a very crude
prototype that made LWLocks use latches to sleep instead of
semaphores.  It was slow.


Hmm. Perhaps we should call drainSelfPipe() only after poll/select 
returns saying that there is something in the self-pipe. That would be a 
win assuming it's more common for the self-pipe to be empty.



AIUI, the only reason why we need the self-pipe thing is because on
some platforms signals don't interrupt system calls.


That's not the only reason. It also eliminates the race condition that 
someone might set the latch after we've checked that it's not set, but 
before calling poll/select. The same reason that ppoll and pselect exist.



But my
impression was that those platforms were somewhat obscure.  Could we
have a separate latch implementation for platforms where we know that
system calls will get interrupted by signals?


... and have ppoll or pselect. Yeah, seems reasonable, assuming that 
ppoll/pselect is faster.



Alternatively, should
we consider reimplementing latches using semaphores?  I assume having
the signal handler up the semaphore would allow the attempt to down
the semaphore to succeed on return from the handler, so it would
accomplish the same thing as the self-pipe trick.


I don't think there's a function to wait for a file descriptor or 
semaphore at the same time.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-04 Thread Robert Haas

On Thu, Sep 4, 2014 at 9:05 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Hmm. Perhaps we should call drainSelfPipe() only after poll/select returns
 saying that there is something in the self-pipe. That would be a win
 assuming it's more common for the self-pipe to be empty.

Couldn't hurt.

 But my
 impression was that those platforms were somewhat obscure.  Could we
 have a separate latch implementation for platforms where we know that
 system calls will get interrupted by signals?

 ... and have ppoll or pselect. Yeah, seems reasonable, assuming that
 ppoll/pselect is faster.

Hrm.  So we'd have to block SIGUSR1, check the flag, then use
pselect() to temporarily unblock SIGUSR1 and wait, then on return
again unblock SIGUSR1?  Doesn't seem very appealing.  I think changing
the signal mask is fast on Linux, but quite slow on at least some
other UNIX-like platforms.  And I've heard that pselect() isn't always
truly atomic, so we might run into platform-specific bugs, too.  I
wonder if there's a better way e.g. using memory barriers.

WaitLatch: check is_set.  if yes then done.  otherwise, set signal_me.
memory barrier.  recheck is_set.  if not set then wait using
poll/select. memory barrier.  clear signal_me.
SetLatch: check is_set.  if yes then done.  otherwise, set is_set.
memory barrier.  check signal_me.  if set, then send SIGUSR1.

 Alternatively, should
 we consider reimplementing latches using semaphores?  I assume having
 the signal handler up the semaphore would allow the attempt to down
 the semaphore to succeed on return from the handler, so it would
 accomplish the same thing as the self-pipe trick.

 I don't think there's a function to wait for a file descriptor or semaphore
 at the same time.

Oh, good point.  So that's out, then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-04 Thread Heikki Linnakangas


On 09/04/2014 04:37 PM, Robert Haas wrote:

Hrm.  So we'd have to block SIGUSR1, check the flag, then use
pselect() to temporarily unblock SIGUSR1 and wait, then on return
again unblock SIGUSR1?  Doesn't seem very appealing.  I think changing
the signal mask is fast on Linux, but quite slow on at least some
other UNIX-like platforms.  And I've heard that pselect() isn't always
truly atomic, so we might run into platform-specific bugs, too.  I
wonder if there's a better way e.g. using memory barriers.

WaitLatch: check is_set.  if yes then done.  otherwise, set signal_me.
memory barrier.  recheck is_set.  if not set then wait using
poll/select. memory barrier.  clear signal_me.
SetLatch: check is_set.  if yes then done.  otherwise, set is_set.
memory barrier.  check signal_me.  if set, then send SIGUSR1.


Doesn't work. No matter what you do, the process running WaitLatch might 
receive the signal immediately before it calls poll/select. The signal 
handler will run, and the poll/select call will then go to sleep. There 
is no way to do this without support from the kernel, that is why 
ppoll/pselect exist.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-04 Thread Robert Haas

On Thu, Sep 4, 2014 at 9:53 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 09/04/2014 04:37 PM, Robert Haas wrote:
 Hrm.  So we'd have to block SIGUSR1, check the flag, then use
 pselect() to temporarily unblock SIGUSR1 and wait, then on return
 again unblock SIGUSR1?  Doesn't seem very appealing.  I think changing
 the signal mask is fast on Linux, but quite slow on at least some
 other UNIX-like platforms.  And I've heard that pselect() isn't always
 truly atomic, so we might run into platform-specific bugs, too.  I
 wonder if there's a better way e.g. using memory barriers.

 WaitLatch: check is_set.  if yes then done.  otherwise, set signal_me.
 memory barrier.  recheck is_set.  if not set then wait using
 poll/select. memory barrier.  clear signal_me.
 SetLatch: check is_set.  if yes then done.  otherwise, set is_set.
 memory barrier.  check signal_me.  if set, then send SIGUSR1.

 Doesn't work. No matter what you do, the process running WaitLatch might
 receive the signal immediately before it calls poll/select. The signal
 handler will run, and the poll/select call will then go to sleep. There is
 no way to do this without support from the kernel, that is why ppoll/pselect
 exist.

Eesh, I was confused there: ignore me.  I was trying to optimize away
the signal handling but assuming we still had the self-pipe byte.  But
of course in that case we don't need to change anything at all.

I'm going to go get some more caffeine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-03 Thread Heikki Linnakangas


On 09/03/2014 12:23 AM, Andres Freund wrote:

On 2014-09-02 17:21:03 -0400, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

I was going to suggest using WaitLatchOrSocket instead of sleeping in 1
second increment, but I see that WaitLatchOrSocket() doesn't currently
support waiting for a socket to become writeable, without also waiting
for it to become readable. I wonder how difficult it would be to lift
that restriction.


My recollection is that there was a reason for that, but I don't recall
details any more.


http://git.postgresql.org/pg/commitdiff/e42a21b9e6c9b9e6346a34b62628d48ff2fc6ddf

In my prototype I've changed the API that errors set both
READABLE/WRITABLE. Seems to work


Andres, would you mind posting the WIP patch you have? That could be a 
better foundation for this patch.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-02 Thread Heikki Linnakangas


On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote:

   To make the code mentioned above (Patch 0002) tidy, rewrite the
   socket emulation code for win32 backends so that each socket
   can have its own non-blocking state. (patch 0001)


The first patch that makes non-blocking sockets behave more sanely on 
Windows seems like a good idea, independently of the second patch. I'm 
looking at the first patch now, I'll make a separate post about the 
second patch.



Some concern about this patch,

- This patch allows the number of non-blocking socket to be below
   64 (FD_SETSIZE) on win32 backend but it seems to be sufficient.


Yeah, that's plenty.


- This patch introduced redundant socket emulation for win32
   backend but win32 bare socket for Port is already nonblocking
   as described so it donsn't seem to be a serious problem on
   performance. Addition to it, since I don't know the reason why
   win32/socket.c provides the blocking-mode socket emulation, I
   decided to preserve win32/socket.c to have blocking socket
   emulation. Possibly it can be removed.


On Windows, the backend has an emulation layer for POSIX signals, which 
uses threads and Windows events. The reason win32/socket.c always uses 
non-blocking mode internally is that it needs to wait for the socket to 
become readable/writeable, and for the signal-emulation event, at the 
same time. So no, we can't remove it.


The approach taken in the first patch seems sensible. I changed it to 
not use FD_SET, though. A custom array seems better, that way we don't 
need the pgwin32_nonblockset_init() call, we can just use initialize the 
variable. It's a little bit more code, but it's well-contained in 
win32/socket.c. Please take a look, to double-check that I didn't screw up.


- Heikki

commit aaaec23b08677baaed900f72db3f9628c0070922
Author: Heikki Linnakangas heikki.linnakan...@iki.fi
Date:   Tue Sep 2 20:05:47 2014 +0300

Improve non-blocking sockets emulation on Windows.

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 605d891..cba79a7 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -795,10 +795,6 @@ pq_set_nonblocking(bool nonblocking)
 	if (MyProcPort-noblock == nonblocking)
 		return;
 
-#ifdef WIN32
-	pgwin32_noblock = nonblocking ? 1 : 0;
-#else
-
 	/*
 	 * Use COMMERROR on failure, because ERROR would try to send the error to
 	 * the client, which might require changing the mode again, leading to
@@ -816,7 +812,6 @@ pq_set_nonblocking(bool nonblocking)
 			ereport(COMMERROR,
 	(errmsg(could not set socket to blocking mode: %m)));
 	}
-#endif
 	MyProcPort-noblock = nonblocking;
 }
 
diff --git a/src/backend/port/win32/socket.c b/src/backend/port/win32/socket.c
index c981169..51982f0 100644
--- a/src/backend/port/win32/socket.c
+++ b/src/backend/port/win32/socket.c
@@ -3,6 +3,9 @@
  * socket.c
  *	  Microsoft Windows Win32 Socket Functions
  *
+ * Blocking socket functions implemented so they listen on both the socket
+ * and the signal event, required for signal handling.
+ *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  *
  * IDENTIFICATION
@@ -14,18 +17,19 @@
 #include postgres.h
 
 /*
- * Indicate if pgwin32_recv() and pgwin32_send() should operate
- * in non-blocking mode.
- *
- * Since the socket emulation layer always sets the actual socket to
- * non-blocking mode in order to be able to deliver signals, we must
- * specify this in a separate flag if we actually need non-blocking
- * operation.
- *
- * This flag changes the behaviour *globally* for all socket operations,
- * so it should only be set for very short periods of time.
+ * Keep track which sockets are in non-blocking mode. Since the socket
+ * emulation layer always sets the actual socket to non-blocking mode in
+ * order to be able to deliver signals, we must track ourselves whether to
+ * present blocking or non-blocking behavior to the callers of pgwin32_recv()
+ * and pgwin32_send(). We expect there to be only a few non-blocking sockets,
+ * so a small array will do.
  */
-int			pgwin32_noblock = 0;
+#define MAX_NONBLOCKING_SOCKETS	10
+
+static SOCKET		nonblocking_sockets[MAX_NONBLOCKING_SOCKETS];
+static int			num_nonblocking_sockets = 0;
+
+static bool pgwin32_socket_is_nonblocking(SOCKET s);
 
 #undef socket
 #undef accept
@@ -33,11 +37,57 @@ int			pgwin32_noblock = 0;
 #undef select
 #undef recv
 #undef send
+#undef closesocket
 
 /*
- * Blocking socket functions implemented so they listen on both
- * the socket and the signal event, required for signal handling.
+ * Add a socket to the list of non-blocking sockets.
  */
+void
+pgwin32_set_socket_noblock(SOCKET s)
+{
+	if (pgwin32_socket_is_nonblocking(s))
+		return;	/* already non-nlocking */
+
+	if (num_nonblocking_sockets = MAX_NONBLOCKING_SOCKETS)
+		elog(ERROR, too many non-blocking sockets);
+
+	nonblocking_sockets[num_nonblocking_sockets++] = s;
+}
+
+/*
+ * Remove a socket from the list of non-blocking

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-02 Thread Heikki Linnakangas


On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote:

- Preventing protocol violation.

   To prevent protocol violation, secure_write sets
   ClientConnectionLost when SIGTERM detected, then
   internal_flush() and ProcessInterrupts() follow the
   instruction.


Oh, hang on. Now that I look at pqcomm.c more closely, it already has a 
mechanism to avoid writing a message in the middle of another message. 
See pq_putmessage and PqCommBusy. Can we rely on that?



- Single pg_terminate_backend surely kills the backend.

   secure_raw_write() uses non-blocking socket and a loop of
   select() with timeout to surely detects received
   signal(SIGTERM).


I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 
second increment, but I see that WaitLatchOrSocket() doesn't currently 
support waiting for a socket to become writeable, without also waiting 
for it to become readable. I wonder how difficult it would be to lift 
that restriction.


I also wonder if it would be simpler to keep the socket in blocking mode 
after all, and just close() in the signal handler if PqCommBusy == true. 
If the signal arrives while sleeping in send(), the effect would be the 
same as with your patch. If the signal arrives while sending, but not 
sleeping, you would not get the chance to send the already-buffered data 
to the client. But maybe that's OK, whether or not you're blocked is not 
very deterministic anyway.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-02 Thread Andres Freund

On 2014-09-02 21:46:29 +0300, Heikki Linnakangas wrote:
 I was going to suggest using WaitLatchOrSocket instead of sleeping in 1
 second increment, but I see that WaitLatchOrSocket() doesn't currently
 support waiting for a socket to become writeable, without also waiting for
 it to become readable. I wonder how difficult it would be to lift that
 restriction.

It's imo not that difficult. I've a prototype patch for that
somewhere. I tested my poll() implementation and it worked, but didn't
yet get to select().

 I also wonder if it would be simpler to keep the socket in blocking mode
 after all, and just close() in the signal handler if PqCommBusy == true. If
 the signal arrives while sleeping in send(), the effect would be the same as
 with your patch. If the signal arrives while sending, but not sleeping, you
 would not get the chance to send the already-buffered data to the client.
 But maybe that's OK, whether or not you're blocked is not very deterministic
 anyway.

I've actually been working on a patch to make the whole interaction with
the client using sockets. The reason I started so is that we lots of far
to complex stuff in signal handlers, and using a latch would allow us to
instead interrupt send/recv. While still heavily WIP the reduction in
odd stuff (check e.g. HandleCatchupInterrupt()) made me rather happy.

I'm slightly worried about the added overhead due to the latch code. In
my implementation I only use latches after a nonblocking read, but
still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if
that can be made problematic.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-02 Thread Andres Freund

On 2014-09-02 21:01:44 +0200, Andres Freund wrote:
 I've actually been working on a patch to make the whole interaction with
 the client using sockets. The reason I started so is that we lots of far
 to complex stuff in signal handlers, and using a latch would allow us to
 instead interrupt send/recv. While still heavily WIP the reduction in
 odd stuff (check e.g. HandleCatchupInterrupt()) made me rather happy.

Actually, the even more important reason is that that would allow us to
throw errors/fatals sanely in interrupts because we wouldn't possibly
jump through openssl anymore...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-02 Thread Tom Lane

Heikki Linnakangas hlinnakan...@vmware.com writes:
 I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 
 second increment, but I see that WaitLatchOrSocket() doesn't currently 
 support waiting for a socket to become writeable, without also waiting 
 for it to become readable. I wonder how difficult it would be to lift 
 that restriction.

My recollection is that there was a reason for that, but I don't recall
details any more.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-02 Thread Andres Freund

On 2014-09-02 17:21:03 -0400, Tom Lane wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
  I was going to suggest using WaitLatchOrSocket instead of sleeping in 1 
  second increment, but I see that WaitLatchOrSocket() doesn't currently 
  support waiting for a socket to become writeable, without also waiting 
  for it to become readable. I wonder how difficult it would be to lift 
  that restriction.
 
 My recollection is that there was a reason for that, but I don't recall
 details any more.

http://git.postgresql.org/pg/commitdiff/e42a21b9e6c9b9e6346a34b62628d48ff2fc6ddf

In my prototype I've changed the API that errors set both
READABLE/WRITABLE. Seems to work

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-08-28 Thread Kyotaro HORIGUCHI

Hello, sorry for the dazed reply in the previous mail.

I made revised patch for this issue.

Attached patches are following,

- 0001_Revise_socket_emulation_for_win32_backend.patch

  Revises socket emulation on win32 backend so that each socket
  can have its own blocking mode state.

- 0002_Allow_backend_termination_during_write_blocking.patch

  The patch to solve the issue. This patch depends on the 0001_
  patch.

==

  I'm marking this as Waiting on Author in the commitfest app, because:
  1. the protocol violation needs to be avoided one way or another, and
  2. the behavior needs to be consistent so that a single
  pg_terminate_backend() is enough to always kill the connection.

- Preventing protocol violation.

  To prevent protocol violation, secure_write sets
  ClientConnectionLost when SIGTERM detected, then
  internal_flush() and ProcessInterrupts() follow the
  instruction.

- Single pg_terminate_backend surely kills the backend.

  secure_raw_write() uses non-blocking socket and a loop of
  select() with timeout to surely detects received
  signal(SIGTERM).

  To avoid frequent switching of blocking mode, the bare socket
  for Port is put to non-blocking mode from the first in
  StreamConnection() and blocking mode is controlled only by
  Port-noblock in secure_raw_read/write().

  To make the code mentioned above (Patch 0002) tidy, rewrite the
  socket emulation code for win32 backends so that each socket
  can have its own non-blocking state. (patch 0001)

Some concern about this patch,

- This patch allows the number of non-blocking socket to be below
  64 (FD_SETSIZE) on win32 backend but it seems to be sufficient.

- This patch introduced redundant socket emulation for win32
  backend but win32 bare socket for Port is already nonblocking
  as described so it donsn't seem to be a serious problem on
  performance. Addition to it, since I don't know the reason why
  win32/socket.c provides the blocking-mode socket emulation, I
  decided to preserve win32/socket.c to have blocking socket
  emulation. Possibly it can be removed.

Any suggestions?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 605d891..c92851e 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -795,10 +795,6 @@ pq_set_nonblocking(bool nonblocking)
 	if (MyProcPort-noblock == nonblocking)
 		return;
 
-#ifdef WIN32
-	pgwin32_noblock = nonblocking ? 1 : 0;
-#else
-
 	/*
 	 * Use COMMERROR on failure, because ERROR would try to send the error to
 	 * the client, which might require changing the mode again, leading to
@@ -816,7 +812,7 @@ pq_set_nonblocking(bool nonblocking)
 			ereport(COMMERROR,
 	(errmsg(could not set socket to blocking mode: %m)));
 	}
-#endif
+
 	MyProcPort-noblock = nonblocking;
 }
 
diff --git a/src/backend/port/win32/socket.c b/src/backend/port/win32/socket.c
index c981169..f0ff3e7 100644
--- a/src/backend/port/win32/socket.c
+++ b/src/backend/port/win32/socket.c
@@ -21,11 +21,8 @@
  * non-blocking mode in order to be able to deliver signals, we must
  * specify this in a separate flag if we actually need non-blocking
  * operation.
- *
- * This flag changes the behaviour *globally* for all socket operations,
- * so it should only be set for very short periods of time.
  */
-int			pgwin32_noblock = 0;
+static fd_set		nonblockset;
 
 #undef socket
 #undef accept
@@ -33,6 +30,7 @@ int			pgwin32_noblock = 0;
 #undef select
 #undef recv
 #undef send
+#undef closesocket
 
 /*
  * Blocking socket functions implemented so they listen on both
@@ -40,6 +38,34 @@ int			pgwin32_noblock = 0;
  */
 
 /*
+ * Set blocking mode for each socket
+ */
+void
+pgwin32_set_socket_nonblock(SOCKET s, int nonblock)
+{
+	if (nonblock)
+		FD_SET(s, nonblockset);
+	else
+		FD_CLR(s, nonblockset);
+
+	/*
+	 * fd_set cannot have more than FD_SETSIZE entries. It's not likey to come
+	 * close to this limit but if it goes above the limit, non blocking state
+	 * of some existing sockets will be discarded.
+	 */
+	if (nonblockset.fd_count = FD_SETSIZE)
+		elog(FATAL, Too many sockets requested to be nonblocking mode.);
+}
+
+void
+pgwin32_nonblockset_init()
+{
+	FD_ZERO(nonblockset);
+}
+
+#define socket_is_nonblocking(s) FD_ISSET((s), nonblockset)
+
+/*
  * Convert the last socket error code into errno
  */
 static void
@@ -256,6 +282,10 @@ pgwin32_socket(int af, int type, int protocol)
 		TranslateSocketError();
 		return INVALID_SOCKET;
 	}
+
+	/* newly cerated socket should be in blocking mode  */
+	pgwin32_set_socket_nonblock(s, false);
+
 	errno = 0;
 
 	return s;
@@ -334,7 +364,7 @@ pgwin32_recv(SOCKET s, char *buf, int len, int f)
 		return -1;
 	}
 
-	if (pgwin32_noblock)
+	if (socket_is_nonblocking(s))
 	{
 		/*
 		 * No data received, and we are in emulated non-blocking mode, so
@@ -420,7 +450,7 @@ pgwin32_send(SOCKET s, const void *buf, int len, int flags)
 			return -1;
 		}
 
-		if

Re: [HACKERS] Escaping from blocked send() reprised.

2014-08-26 Thread Kyotaro HORIGUCHI

Sorry, I was absorbed by other tasks..

Thank you for reviewing thiis.

 On 07/01/2014 06:26 AM, Kyotaro HORIGUCHI wrote:
  At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas
  robertmh...@gmail.com wrote in
  CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=wc8...@mail.gmail.com
  1. I think it's the case that there are platforms around where a
  signal won't cause send() to return EINTR and I'd be entirely
  unsurprised if SSL_write() doesn't necessarily return EINTR in that
  case.  I'm not sure what, if anything, we can do about that.
 
 We use a custom write routine with SSL_write, where we call send()
 ourselves, so that's not a problem as long as we put the check in the
 right place (in secure_raw_write(), after my recent SSL refactoring -
 the patch needs to be rebased).
 
  man 2 send on FreeBSD has not description about EINTR.. And even
  on linux, send won't return EINTR for most cases, at least I
  haven't seen that. So send()=-1,EINTR seems to me as only an
  equivalent of send() = 0. I have no idea about what the
  implementer thought the difference is.
 
 As the patch stands, there's a race condition: if the SIGTERM arrives
 *before* the send() call, the send() won't return EINTR anyway. So
 there's a chance that you still block. Calling pq_terminate_backend()
 again will dislodge it (assuming send() returns with EINTR on signal),

Yes, that window would'nt be extinguished without introducing
something more. EINTR is set only when nothing sent by the
call. So AFAIS the chance of getting EINTR is far small than
expectation.

 but I don't think we want to define the behavior as usually,
 pq_terminate_backend() will kill a backend that's blocked on sending
 to the client, but sometimes you have to call it twice (or more!) to
 really kill it.

I agree that it is desirable behavior, if any measure to avoid
that. But I think it's better than doing kill -9 engulfing all
innocent backends.

 A more robust way is to set ImmediateInterruptOK before calling
 send(). That wouldn't let you send data that can be sent without
 blocking though. For that, you could put the socket to non-blocking
 mode, and sleep with select(), also waiting for the process' latch at
 the same time (die() sets the latch, so that will wake up the select()
 if a termination request arrives).

I condiered it but select() frequently (rather in most cases when
send() blocks by send buffer exhaustion) fails to predict that
following send() will be blocked. (If my memory is correct.)  So
the final problem would be blocked send()...

 Is it actually safe to process the die-interrupt where send() is
 called? ProcessInterrupts() does ereport(FATAL, ...), which will
 attempt to send a message to the client. If that happens in the middle
 of constructing some other message, that will violate the protocol.

So I strongly agree to you if select() works as the impression
when reading the man document.

  2. I think it would be reasonable to try to kill off the connection
  without notifying the client if we're unable to send the data to the
  client in a reasonable period of time.  But I'm unsure what a
  reasonable period of time means.  This patch would basically do it
  after no delay at all, which seems like it might be too aggressive.
  However, I'm not sure.
 
  I think there's no such a reasonable time.
 
 I agree it's pretty hard to define any reasonable timeout here. I
 think it would be fine to just cut the connection; even if you don't
 block while sending, you'll probably reach a CHECK_FOR_INTERRUPT()
 somewhere higher in the stack and kill the connection almost as
 abruptly anyway. (you can't violate the protocol, however)

Yes, closing the blocked connection seems one of the most smarter
way, checking the occurred interrupt could avoid protocol
violation. But the problem for that is that there seems no means
to close sockets elsewhere the blocking handle. dup(2)'ed handle
cannot release the resource by only itself.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-08-26 Thread Heikki Linnakangas


On 08/26/2014 09:17 AM, Kyotaro HORIGUCHI wrote:

but I don't think we want to define the behavior as usually,
pq_terminate_backend() will kill a backend that's blocked on sending
to the client, but sometimes you have to call it twice (or more!) to
really kill it.


I agree that it is desirable behavior, if any measure to avoid
that. But I think it's better than doing kill -9 engulfing all
innocent backends.


A more robust way is to set ImmediateInterruptOK before calling
send(). That wouldn't let you send data that can be sent without
blocking though. For that, you could put the socket to non-blocking
mode, and sleep with select(), also waiting for the process' latch at
the same time (die() sets the latch, so that will wake up the select()
if a termination request arrives).


I condiered it but select() frequently (rather in most cases when
send() blocks by send buffer exhaustion) fails to predict that
following send() will be blocked. (If my memory is correct.)  So
the final problem would be blocked send()...


My point was to put the socket in non-blocking mode, so that send() will 
return immediately with EAGAIN instead of blocking, if the send buffer 
is full. See WalSndWriteData for how that would work, it does something 
similar.



Is it actually safe to process the die-interrupt where send() is
called? ProcessInterrupts() does ereport(FATAL, ...), which will
attempt to send a message to the client. If that happens in the middle
of constructing some other message, that will violate the protocol.


So I strongly agree to you if select() works as the impression
when reading the man document.


Not sure what you mean, but the above is a fatal problem with the patch 
right now, regardless of how you do the sleeping.



2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time.  But I'm unsure what a
reasonable period of time means.  This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.


I think there's no such a reasonable time.


I agree it's pretty hard to define any reasonable timeout here. I
think it would be fine to just cut the connection; even if you don't
block while sending, you'll probably reach a CHECK_FOR_INTERRUPT()
somewhere higher in the stack and kill the connection almost as
abruptly anyway. (you can't violate the protocol, however)


Yes, closing the blocked connection seems one of the most smarter
way, checking the occurred interrupt could avoid protocol
violation. But the problem for that is that there seems no means
to close sockets elsewhere the blocking handle. dup(2)'ed handle
cannot release the resource by only itself.


I didn't understand that, surely you can just close() the socket? There 
is no dup(2) involved. And we don't necessarily need to close the 
socket, we just need to avoid writing to it when we're already in the 
middle of sending a message.


I'm marking this as Waiting on Author in the commitfest app, because:
1. the protocol violation needs to be avoided one way or another, and
2. the behavior needs to be consistent so that a single 
pg_terminate_backend() is enough to always kill the connection.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-08-26 Thread Kyotaro HORIGUCHI

Hello,

  I condiered it but select() frequently (rather in most cases when
  send() blocks by send buffer exhaustion) fails to predict that
  following send() will be blocked. (If my memory is correct.)  So
  the final problem would be blocked send()...
 
 My point was to put the socket in non-blocking mode, so that send()
 will return immediately with EAGAIN instead of blocking, if the send
 buffer is full. See WalSndWriteData for how that would work, it does
 something similar.

I confused it with what I did during writing this patch. select()
- blocking send(). Sorry for confusing the discussion. I
understand correctly what you mean and It sounds reasonable.

  I agree it's pretty hard to define any reasonable timeout here. I
  think it would be fine to just cut the connection; even if you don't
  block while sending, you'll probably reach a CHECK_FOR_INTERRUPT()
  somewhere higher in the stack and kill the connection almost as
  abruptly anyway. (you can't violate the protocol, however)
 
  Yes, closing the blocked connection seems one of the most smarter
  way, checking the occurred interrupt could avoid protocol
  violation. But the problem for that is that there seems no means
  to close sockets elsewhere the blocking handle. dup(2)'ed handle
  cannot release the resource by only itself.
 
 I didn't understand that, surely you can just close() the socket?
 There is no dup(2) involved. And we don't necessarily need to close
 the socket, we just need to avoid writing to it when we're already in
 the middle of sending a message.

My assumption there was select() and *blocking* send(). So the
sentence cited is out of point from the first.

 I'm marking this as Waiting on Author in the commitfest app, because:
 1. the protocol violation needs to be avoided one way or another, and
 2. the behavior needs to be consistent so that a single
 pg_terminate_backend() is enough to always kill the connection.

Thank you for the suggestion. I think I can go forward with that
and will come up with new patch.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-08-25 Thread Heikki Linnakangas


On 07/01/2014 06:26 AM, Kyotaro HORIGUCHI wrote:

At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas robertmh...@gmail.com wrote in 
CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=wc8...@mail.gmail.com

1. I think it's the case that there are platforms around where a
signal won't cause send() to return EINTR and I'd be entirely
unsurprised if SSL_write() doesn't necessarily return EINTR in that
case.  I'm not sure what, if anything, we can do about that.


We use a custom write routine with SSL_write, where we call send() 
ourselves, so that's not a problem as long as we put the check in the 
right place (in secure_raw_write(), after my recent SSL refactoring - 
the patch needs to be rebased).



man 2 send on FreeBSD has not description about EINTR.. And even
on linux, send won't return EINTR for most cases, at least I
haven't seen that. So send()=-1,EINTR seems to me as only an
equivalent of send() = 0. I have no idea about what the
implementer thought the difference is.


As the patch stands, there's a race condition: if the SIGTERM arrives 
*before* the send() call, the send() won't return EINTR anyway. So 
there's a chance that you still block. Calling pq_terminate_backend() 
again will dislodge it (assuming send() returns with EINTR on signal), 
but I don't think we want to define the behavior as usually, 
pq_terminate_backend() will kill a backend that's blocked on sending to 
the client, but sometimes you have to call it twice (or more!) to really 
kill it.


A more robust way is to set ImmediateInterruptOK before calling send(). 
That wouldn't let you send data that can be sent without blocking 
though. For that, you could put the socket to non-blocking mode, and 
sleep with select(), also waiting for the process' latch at the same 
time (die() sets the latch, so that will wake up the select() if a 
termination request arrives).


Is it actually safe to process the die-interrupt where send() is called? 
ProcessInterrupts() does ereport(FATAL, ...), which will attempt to 
send a message to the client. If that happens in the middle of 
constructing some other message, that will violate the protocol.



2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time.  But I'm unsure what a
reasonable period of time means.  This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.


I think there's no such a reasonable time.


I agree it's pretty hard to define any reasonable timeout here. I think 
it would be fine to just cut the connection; even if you don't block 
while sending, you'll probably reach a CHECK_FOR_INTERRUPT() somewhere 
higher in the stack and kill the connection almost as abruptly anyway. 
(you can't violate the protocol, however)


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-07-04 Thread Kyotaro HORIGUCHI

Hello,

At Tue, 1 Jul 2014 21:21:38 +0200, Martijn van Oosterhout klep...@svana.org 
wrote in 20140701192138.gb20...@svana.org
 On Tue, Jul 01, 2014 at 12:26:43PM +0900, Kyotaro HORIGUCHI wrote:
   1. I think it's the case that there are platforms around where a
   signal won't cause send() to return EINTR and I'd be entirely
   unsurprised if SSL_write() doesn't necessarily return EINTR in that
   case.  I'm not sure what, if anything, we can do about that.
  
  man 2 send on FreeBSD has not description about EINTR.. And even
  on linux, send won't return EINTR for most cases, at least I
  haven't seen that. So send()=-1,EINTR seems to me as only an
  equivalent of send() = 0. I have no idea about what the
  implementer thought the difference is.
 
 Whether send() returns EINTR or not depends on whether the signal has
 been marked restartable or not. This is configurable per signal, see
 sigaction(). If the signal is marked to restart, the kernel returns
 ERESTARTHAND (IIRC) and the libc will redo the call internally.

Wow, thank you for detailed information. I'll study that and take
it into future discussion.

 Default BSD does not return EINTR normally, but supports sigaction().

I guess it is for easiness-to-keep-compatibility, seems
reasonable.


have a nice day,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-07-04 Thread Kyotaro HORIGUCHI

Hello, thank you for keeping this discussion moving.

  I think there's no such a reasonable time. The behavior might
  should be determined from another point.. On alternative would be
  let pg_terminate_backend() have a parameter instructing force
  shutodwn (how to propagate it?), or make a forced shutdown on
  duplicate invocation of pg_terminate_backend().
 
 Well, I think that when people call pg_terminate_backend() just once,
 they expect it to kill the target backend.  I think people will
 tolerate a short delay, like a few seconds; after all, there's no
 guarantee, even today, that the backend will hit a
 CHECK_FOR_INTERRUPTS() in less than a few hundred milliseconds.

Sure.

 But they are not going to want to have to take a second action
 to kill the backend - killing it once should be sufficient.

Hmm, it sounds persuasive. Well, do you think they tolerate
-force option? (Even though its technical practicality is not
clear)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-07-01 Thread Robert Haas

On Mon, Jun 30, 2014 at 11:26 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 2. I think it would be reasonable to try to kill off the connection
 without notifying the client if we're unable to send the data to the
 client in a reasonable period of time.  But I'm unsure what a
 reasonable period of time means.  This patch would basically do it
 after no delay at all, which seems like it might be too aggressive.
 However, I'm not sure.

 I think there's no such a reasonable time. The behavior might
 should be determined from another point.. On alternative would be
 let pg_terminate_backend() have a parameter instructing force
 shutodwn (how to propagate it?), or make a forced shutdown on
 duplicate invocation of pg_terminate_backend().

Well, I think that when people call pg_terminate_backend() just once,
they expect it to kill the target backend.  I think people will
tolerate a short delay, like a few seconds; after all, there's no
guarantee, even today, that the backend will hit a
CHECK_FOR_INTERRUPTS() in less than a few hundred milliseconds.  But
they are not going to want to have to take a second action to kill the
backend - killing it once should be sufficient.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-07-01 Thread Martijn van Oosterhout

On Tue, Jul 01, 2014 at 12:26:43PM +0900, Kyotaro HORIGUCHI wrote:
  1. I think it's the case that there are platforms around where a
  signal won't cause send() to return EINTR and I'd be entirely
  unsurprised if SSL_write() doesn't necessarily return EINTR in that
  case.  I'm not sure what, if anything, we can do about that.
 
 man 2 send on FreeBSD has not description about EINTR.. And even
 on linux, send won't return EINTR for most cases, at least I
 haven't seen that. So send()=-1,EINTR seems to me as only an
 equivalent of send() = 0. I have no idea about what the
 implementer thought the difference is.

Whether send() returns EINTR or not depends on whether the signal has
been marked restartable or not. This is configurable per signal, see
sigaction(). If the signal is marked to restart, the kernel returns
ERESTARTHAND (IIRC) and the libc will redo the call internally.

Default BSD does not return EINTR normally, but supports sigaction().

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature

[HACKERS] Escaping from blocked send() reprised.

2014-06-30 Thread Kyotaro HORIGUCHI

Hello, I have received inquiries related to blocked communication
several times for these weeks with different symptoms. Then I
found this message from archive,

http://postgresql.1045698.n5.nabble.com/Escaping-a-blocked-sendto-syscall-without-causing-a-restart-td5740855.html

Subject: Escaping a blocked sendto() syscall without causing a restart

Mr. Tom Lane gave a comment replying it,

Offhand it looks to me like most signals would kick the backend off the
send() call ... but it would loop right back and try again. See
internal_flush() in pqcomm.c. (If you're using SSL, this diagnosis
may or may not apply.)

We can't do anything except repeat the send attempt if the client
connection is to be kept in a sane state.
(snipped)
And I'm not at all sure if we could get it to work in SSL mode...

That's true for timeouts that should continue the connection,
say, statement_timeout, but focusing on intentional backend
termination, I think it does no harm to break it up abruptly,
even if it was on SSL. On the other hand it seems still
preferable to keep a connection when not blocked. The following
expression would detects such a blocking state at just before
next send(2) after the previous try exited by signals.

(ProcDiePending select(1, NULL, fd, NULL, '1 sec') == 0)

Finally, pg_terminate_backend() works even when send is blocked
for both SSL and non-SSL connections after 1 second delay with
this patch (break_socket_blocking_on_termination_v1.patch).

Nevetheless, of course statement_timeout cannot become effective
by this method since it breaks the consistency in the client
protocol. It needs change in client protocol to have out of
band mechanism or something, maybe.

Any suggestions?

Attached patches are:

- break_socket_blocking_on_termination_v1.patch : The patch to
break blocked state of send(2) for pg_terminate_backend().

- socket_block_test.patch : debug printing and changing send
buffer of libpq for reproducing the blocked situation.

Some point of discussion follows,

Discussion about the appropriateness of looking into
ProcDiePending there and calling CHECK_FOR_INTERRUPTS()
seeing it.

I have somewhat uneasiness of these things, but what we can at
most seems to be replacing ProcDiePending here with some another
variable, say ImmediatelyExitFromBlockedState, and somehow go
upstairs through normal return path. Additional Try-Catch seems
can do that but it looks no benefit for the added complexity..

Discussion on breaking up connetion especially for SSL

Breaking an SSL connection up in my_sock_write() cause the
following message on client side if it still lives and resumes to
receive from the connection, this seems to show that the client
handles the event properly.

| SSL SYSCALL error: EOF detected
| The connection to the server was lost. Attempting reset: Succeeded.

Discussion on reliability of select(2)

This method is not a perfect solution, since the select(2)
sometimes gives a wrong oracle about wheather the follwing
send(2) will be blocked.

Even so, as far as I see, select(2) just after exiting from
blocked send(2) by signal seems always says 'write will be
blocked', so what this patch does seems to save most cases except
when the any amount of socket buffer is vacated just before the
following select. The second chance to exit from blocked send(2)
won't come after this(, before one more pg_terminate_backend() ?).

Removing the select(2) from the condition (that is,
CHECK_FOR_INTERRUPTS() is called always ProcDiePending is true)
prevents such a possibility, in exchange for killing 'really
live' connection but IMHO it's no problem on intentional server
termination.

More reliable measure for this would be non-blocking IO but it
changes more of the code.

Reproducing the situation.

Another possible question would be about the possibility of such
blocking, or how to reproduce the situation. I found that send(2)
on CentOS6.5 somehow sends successfully, for most cases, the
remaining data at the retry after exiting by signal during being
blocked with buffer full, in spite of no change in environment.

So reproducing the stucked situation is rather difficult on the
server as is. But such situation would be reproduced quite easily
with some cheat, that is, enlarging PQ send buffer, say by ten
times.

Applying the attached testing patch (socket_block_test.patch),
the following steps will make the stucked situation.

1. Do a select which returns big result and enter Ctrl-Z just
after invoking.

cl $ psql -h localhost postgres
cl postgres=# select 1 from generate_series(0, 999);
cl ^Z
cl [4]+ Stopped psql -h localhost postgres

2. Watch the server to stuck.

The server starts to print lines like following to console
after a while, then stops. The number enclosed by the square
brackets is PID of the server.

sv [8809] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0

3. Do

Re: [HACKERS] Escaping from blocked send() reprised.

2014-06-30 Thread Robert Haas

On Mon, Jun 30, 2014 at 4:13 AM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
Hello, I have received inquiries related to blocked communication
several times for these weeks with different symptoms. Then I
found this message from archive,

http://postgresql.1045698.n5.nabble.com/Escaping-a-blocked-sendto-syscall-without-causing-a-restart-td5740855.html

Subject: Escaping a blocked sendto() syscall without causing a restart

Mr. Tom Lane gave a comment replying it,

We can't do anything except repeat the send attempt if the client
connection is to be kept in a sane state.
(snipped)
And I'm not at all sure if we could get it to work in SSL mode...

(ProcDiePending select(1, NULL, fd, NULL, '1 sec') == 0)

Finally, pg_terminate_backend() works even when send is blocked
for both SSL and non-SSL connections after 1 second delay with
this patch (break_socket_blocking_on_termination_v1.patch).

Any suggestions?

You should probably add your patch here, so it doesn't get forgotten about:

https://commitfest.postgresql.org/action/commitfest_view/open

We're focused on reviewing patches for the current CommitFest, so your
patch might not get attention right away. A couple of general
thoughts on this topic:

1. I think it's the case that there are platforms around where a
signal won't cause send() to return EINTR and I'd be entirely
unsurprised if SSL_write() doesn't necessarily return EINTR in that
case. I'm not sure what, if anything, we can do about that.

2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time. But I'm unsure what a
reasonable period of time means. This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Escaping from blocked send() reprised.

2014-06-30 Thread Kyotaro HORIGUCHI

Hello, The replies follow are mainly as a memo for myself so
please don't be bothered to answer until the time comes.

At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas robertmh...@gmail.com wrote 
in CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=wc8...@mail.gmail.com
 You should probably add your patch here, so it doesn't get forgotten about:
 
 https://commitfest.postgresql.org/action/commitfest_view/open

Ok, I'll put this there. 

 We're focused on reviewing patches for the current CommitFest, so your
 patch might not get attention right away.  A couple of general
 thoughts on this topic:

Thank you for suggestions. I'll consider on them.

 1. I think it's the case that there are platforms around where a
 signal won't cause send() to return EINTR and I'd be entirely
 unsurprised if SSL_write() doesn't necessarily return EINTR in that
 case.  I'm not sure what, if anything, we can do about that.

man 2 send on FreeBSD has not description about EINTR.. And even
on linux, send won't return EINTR for most cases, at least I
haven't seen that. So send()=-1,EINTR seems to me as only an
equivalent of send() = 0. I have no idea about what the
implementer thought the difference is.


 2. I think it would be reasonable to try to kill off the connection
 without notifying the client if we're unable to send the data to the
 client in a reasonable period of time.  But I'm unsure what a
 reasonable period of time means.  This patch would basically do it
 after no delay at all, which seems like it might be too aggressive.
 However, I'm not sure.

I think there's no such a reasonable time. The behavior might
should be determined from another point.. On alternative would be
let pg_terminate_backend() have a parameter instructing force
shutodwn (how to propagate it?), or make a forced shutdown on
duplicate invocation of pg_terminate_backend().

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

67 matches

Mail list logo