Re: [HACKERS] Streaming replication on win32, still broken

2010-02-21 Thread Fujii Masao
On Fri, Feb 19, 2010 at 7:54 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Heikki Linnakangas wrote: Magnus Hagander wrote: Well, it's going to make the process that reads the WAL cause actual physical I/O... That'll take a chunk out of your total available I/O, which is

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-19 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Magnus Hagander wrote: Well, it's going to make the process that reads the WAL cause actual physical I/O... That'll take a chunk out of your total available I/O, which is likely to push you to the limit of your I/O capacity much quicker. Right, doesn't seem

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-18 Thread Magnus Hagander
2010/2/18 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Fujii Masao wrote: On Thu, Feb 18, 2010 at 5:28 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: If I'm reading the patch correctly, when wal_sync_method is 'open_sync', walreceiver nevertheless opens the WAL

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-18 Thread Heikki Linnakangas
Magnus Hagander wrote: O_DIRECT helps us when we're not going to read the file again, because we don't waste cache on it. If we are, which is the case here, it should be really bad for performance, since we actually have to do a physical read. Incidentally, that should also apply to general

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-18 Thread Fujii Masao
On Thu, Feb 18, 2010 at 7:04 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Magnus Hagander wrote: O_DIRECT helps us when we're not going to read the file again, because we don't waste cache on it. If we are, which is the case here, it should be really bad for performance,

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-18 Thread Magnus Hagander
2010/2/18 Fujii Masao masao.fu...@gmail.com: On Thu, Feb 18, 2010 at 7:04 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Magnus Hagander wrote: O_DIRECT helps us when we're not going to read the file again, because we don't waste cache on it. If we are, which is the case

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-18 Thread Heikki Linnakangas
Magnus Hagander wrote: 2010/2/18 Fujii Masao masao.fu...@gmail.com: On Thu, Feb 18, 2010 at 7:04 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Magnus Hagander wrote: O_DIRECT helps us when we're not going to read the file again, because we don't waste cache on it. If we

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-17 Thread Fujii Masao
On Wed, Feb 17, 2010 at 4:07 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Feb 17, 2010 at 3:03 PM, Magnus Hagander mag...@hagander.net wrote: In that case, O_DIRECT would be counterproductive, no? It maps to FILE_FLAG_NO_BUFFERING, which makes sure it doesn't go into the cache. So the

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-17 Thread Fujii Masao
On Wed, Feb 17, 2010 at 6:00 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Feb 17, 2010 at 4:07 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Feb 17, 2010 at 3:03 PM, Magnus Hagander mag...@hagander.net wrote: In that case, O_DIRECT would be counterproductive, no? It maps to

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-17 Thread Heikki Linnakangas
Fujii Masao wrote: On Wed, Feb 17, 2010 at 6:00 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Feb 17, 2010 at 4:07 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Feb 17, 2010 at 3:03 PM, Magnus Hagander mag...@hagander.net wrote: In that case, O_DIRECT would be

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-17 Thread Fujii Masao
On Thu, Feb 18, 2010 at 5:28 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: If I'm reading the patch correctly, when wal_sync_method is 'open_sync', walreceiver nevertheless opens the WAL file without the O_DIRECT flag. When it later flushes it in XLogWalRcvFlush() by

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-17 Thread Heikki Linnakangas
Fujii Masao wrote: On Thu, Feb 18, 2010 at 5:28 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: If I'm reading the patch correctly, when wal_sync_method is 'open_sync', walreceiver nevertheless opens the WAL file without the O_DIRECT flag. When it later flushes it in

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Fujii Masao
On Tue, Feb 16, 2010 at 12:37 AM, Magnus Hagander mag...@hagander.net wrote: With the libpq fixes, I get further (more on that fix later, btw), but now I get stuck in this. When I do something on the master that generates WAL, such as insert a record, and then try to query this on the slave,

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Magnus Hagander
2010/2/16 Fujii Masao masao.fu...@gmail.com: On Tue, Feb 16, 2010 at 12:37 AM, Magnus Hagander mag...@hagander.net wrote: With the libpq fixes, I get further (more on that fix later, btw), but now I get stuck in this. When I do something on the master that generates WAL, such as insert a

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Fujii Masao
On Tue, Feb 16, 2010 at 7:20 PM, Magnus Hagander mag...@hagander.net wrote: 2010/2/16 Fujii Masao masao.fu...@gmail.com: On Tue, Feb 16, 2010 at 12:37 AM, Magnus Hagander mag...@hagander.net wrote: With the libpq fixes, I get further (more on that fix later, btw), but now I get stuck in

Re: [HACKERS] Streaming Replication on win32

2010-02-16 Thread Magnus Hagander
2010/2/16 Fujii Masao masao.fu...@gmail.com: On Tue, Feb 16, 2010 at 1:33 AM, Magnus Hagander mag...@hagander.net wrote: 2010/2/15 Tom Lane t...@sss.pgh.pa.us: Magnus Hagander mag...@hagander.net writes: I changed your patch to this, because I find it a lot simpler. The change is in the

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Magnus Hagander
2010/2/16 Fujii Masao masao.fu...@gmail.com: On Tue, Feb 16, 2010 at 7:20 PM, Magnus Hagander mag...@hagander.net wrote: 2010/2/16 Fujii Masao masao.fu...@gmail.com: On Tue, Feb 16, 2010 at 12:37 AM, Magnus Hagander mag...@hagander.net wrote: With the libpq fixes, I get further (more on that

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Fujii Masao
On Wed, Feb 17, 2010 at 6:28 AM, Magnus Hagander mag...@hagander.net wrote: If you send me your amazon id, I can get you permissions on my private image. I plan to clean it up and make it public, just haven't gotten around to it yet... Thanks for your concern! I'll send the ID when I complete

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Magnus Hagander
On Wed, Feb 17, 2010 at 06:55, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Feb 17, 2010 at 6:28 AM, Magnus Hagander mag...@hagander.net wrote: If you send me your amazon id, I can get you permissions on my private image. I plan to clean it up and make it public, just haven't gotten around

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes: On Wed, Feb 17, 2010 at 06:55, Fujii Masao masao.fu...@gmail.com wrote: 2. Straightforwardly observe the alignment rule. Since the received WAL data might start at the middle of WAL block, walreceiver needs to keep the last half-written WAL block

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Fujii Masao
On Wed, Feb 17, 2010 at 3:03 PM, Magnus Hagander mag...@hagander.net wrote: In that case, O_DIRECT would be counterproductive, no? It maps to FILE_FLAG_NO_BUFFERING, which makes sure it doesn't go into the cache. So the read in the startup proc is actually guaranteed to require a physical

Re: [HACKERS] Streaming replication on win32, still broken

2010-02-16 Thread Fujii Masao
On Wed, Feb 17, 2010 at 3:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: On Wed, Feb 17, 2010 at 06:55, Fujii Masao masao.fu...@gmail.com wrote: 2. Straightforwardly observe the alignment rule. Since the received WAL data might start at the middle of

[HACKERS] Streaming replication on win32, still broken

2010-02-15 Thread Magnus Hagander
With the libpq fixes, I get further (more on that fix later, btw), but now I get stuck in this. When I do something on the master that generates WAL, such as insert a record, and then try to query this on the slave, the walreceiver process crashes with: PANIC: XX000: could not write to log file

Re: [HACKERS] Streaming Replication on win32

2010-02-15 Thread Magnus Hagander
2010/2/15 Fujii Masao masao.fu...@gmail.com: On Sun, Feb 14, 2010 at 11:52 PM, Magnus Hagander mag...@hagander.net wrote: Remember that the win32 code *always* puts the socket in non-blocking mode. So we can't just teach the layer about it. We need some way to pass the information down that

Re: [HACKERS] Streaming Replication on win32

2010-02-15 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes: I changed your patch to this, because I find it a lot simpler. The change is in the checking in pgwin32_recv - there is no need to ever call waitforsinglesocket, we can just exit out early. Do you see any issue with that? This definitely looks

Re: [HACKERS] Streaming Replication on win32

2010-02-15 Thread Magnus Hagander
2010/2/15 Tom Lane t...@sss.pgh.pa.us: Magnus Hagander mag...@hagander.net writes: I changed your patch to this, because I find it a lot simpler. The change is in the checking in pgwin32_recv - there is no need to ever call waitforsinglesocket, we can just exit out early. Do you see any

Re: [HACKERS] Streaming Replication on win32

2010-02-15 Thread Fujii Masao
On Tue, Feb 16, 2010 at 1:33 AM, Magnus Hagander mag...@hagander.net wrote: 2010/2/15 Tom Lane t...@sss.pgh.pa.us: Magnus Hagander mag...@hagander.net writes: I changed your patch to this, because I find it a lot simpler. The change is in the checking in pgwin32_recv - there is no need to ever

Re: [HACKERS] Streaming Replication on win32

2010-02-14 Thread Magnus Hagander
2010/2/8 Fujii Masao masao.fu...@gmail.com: On Mon, Jan 18, 2010 at 11:46 PM, Magnus Hagander mag...@hagander.net wrote: From what I can tell, this indicates that pq_getbyte_if_available() is not working - because it's supposed to never block, right? Right, it's not supposed to block. This

Re: [HACKERS] Streaming Replication on win32

2010-02-14 Thread Fujii Masao
On Sun, Feb 14, 2010 at 11:52 PM, Magnus Hagander mag...@hagander.net wrote: Sorry about the delay in responding to this. Thanks for the response. Remember that the win32 code *always* puts the socket in non-blocking mode. So we can't just teach the layer about it. We need some way to pass

Re: [HACKERS] Streaming Replication on win32

2010-02-08 Thread Fujii Masao
On Mon, Jan 18, 2010 at 11:46 PM, Magnus Hagander mag...@hagander.net wrote: From what I can tell, this indicates that pq_getbyte_if_available() is not working - because it's supposed to never block, right? Right, it's not supposed to block. This could be because the win32 socket emulation

Re: [HACKERS] Streaming Replication on win32

2010-01-24 Thread Joe Conway
On 01/21/2010 11:19 PM, Heikki Linnakangas wrote: Joe Conway wrote: OK, so now I see why we want this fixed for dblink and walreceiver, but doesn't this approach leave every other WIN32 libpq client out in the cold? Is there nothing that can be done for the general case, or is it a SMOP?

Re: [HACKERS] Streaming Replication on win32

2010-01-24 Thread Magnus Hagander
2010/1/24 Joe Conway m...@joeconway.com: On 01/21/2010 11:19 PM, Heikki Linnakangas wrote: Joe Conway wrote: OK, so now I see why we want this fixed for dblink and walreceiver, but doesn't this approach leave every other WIN32 libpq client out in the cold? Is there nothing that can be done

Re: [HACKERS] Streaming Replication on win32

2010-01-24 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes: 2010/1/24 Joe Conway m...@joeconway.com: Sorry for being thick -- I'm still missing something. I don't understand why any user program using libpq/PQexec running on Windows does not have the same issue. Or to put it another way, why does this only

Re: [HACKERS] Streaming Replication on win32

2010-01-24 Thread Heikki Linnakangas
Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: 2010/1/24 Joe Conway m...@joeconway.com: Sorry for being thick -- I'm still missing something. I don't understand why any user program using libpq/PQexec running on Windows does not have the same issue. Or to put it another way, why

Re: [HACKERS] Streaming Replication on win32

2010-01-22 Thread Dimitri Fontaine
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Joe Conway wrote: OK, so now I see why we want this fixed for dblink and walreceiver, but doesn't this approach leave every other WIN32 libpq client out in the cold? Is there nothing that can be done for the general case, or is it

Re: [HACKERS] Streaming Replication on win32

2010-01-22 Thread Marko Kreen
On 1/22/10, Dimitri Fontaine dfonta...@hi-media.com wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Joe Conway wrote: OK, so now I see why we want this fixed for dblink and walreceiver, but doesn't this approach leave every other WIN32 libpq client out in the

Re: [HACKERS] Streaming Replication on win32

2010-01-22 Thread Heikki Linnakangas
Marko Kreen wrote: On 1/22/10, Dimitri Fontaine dfonta...@hi-media.com wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: The problem only applies to libpq calls from the backend. Client apps are not affected, only backend modules. If there's any other modules out

Re: [HACKERS] Streaming Replication on win32

2010-01-21 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Magnus Hagander wrote: 2010/1/17 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: We could replace the blocking PQexec() calls with PQsendQuery(), and use the emulated version of select() to wait. Hmm. That would at least theoretically work, but aren't there

Re: [HACKERS] Streaming Replication on win32

2010-01-21 Thread Joe Conway
On 01/21/2010 04:46 AM, Heikki Linnakangas wrote: Heikki Linnakangas wrote: Magnus Hagander wrote: 2010/1/17 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: We could replace the blocking PQexec() calls with PQsendQuery(), and use the emulated version of select() to wait. Hmm. That

Re: [HACKERS] Streaming Replication on win32

2010-01-21 Thread Heikki Linnakangas
Joe Conway wrote: +#ifdef WIN23 ^ I assume you meant WIN32 here ;-) Yeah. I admit I haven't tested this on Windows, I just commented out those #ifdef's and tested on Linux. Will need to verify that this actually solves the problem on Windows before committing. +#define

Re: [HACKERS] Streaming Replication on win32

2010-01-21 Thread Joe Conway
On 01/21/2010 10:33 PM, Heikki Linnakangas wrote: Joe Conway wrote: I have not been really following this thread, but why can't we put the #ifdef WIN32 and special definition of these functions into libpq. I don't understand why we need special treatment for dblink. The problem is that

Re: [HACKERS] Streaming Replication on win32

2010-01-21 Thread Heikki Linnakangas
Joe Conway wrote: OK, so now I see why we want this fixed for dblink and walreceiver, but doesn't this approach leave every other WIN32 libpq client out in the cold? Is there nothing that can be done for the general case, or is it a SMOP? The problem only applies to libpq calls from the

Re: [HACKERS] Streaming Replication on win32

2010-01-18 Thread Fujii Masao
On Mon, Jan 18, 2010 at 5:22 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: This could be because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets. Specifically, it actually *always* sets the socket to non-blocking mode, and then uses

Re: [HACKERS] Streaming Replication on win32

2010-01-18 Thread Magnus Hagander
On Mon, Jan 18, 2010 at 10:30, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Jan 18, 2010 at 5:22 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: This could be because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets. Specifically,

Re: [HACKERS] Streaming Replication on win32

2010-01-18 Thread Fujii Masao
On Mon, Jan 18, 2010 at 6:40 PM, Magnus Hagander mag...@hagander.net wrote: SSL_read calls into pgwin32_recv(), so you have the same problem. (see my_sock_read() and my_sock_write() in be-secure.c) Oh, I confirmed that. Thanks! Can we prevent SSL_read from being blocked in the renegotiation

Re: [HACKERS] Streaming Replication on win32

2010-01-18 Thread Magnus Hagander
2010/1/18 Tom Lane t...@sss.pgh.pa.us: Magnus Hagander mag...@hagander.net writes: Which shows one potentially big problem - since we're calling select() from inside libpq, it's not calling our signal emulation layer compatible select(). This means that at this point, walreceiver is not

Re: [HACKERS] Streaming Replication on win32

2010-01-18 Thread Magnus Hagander
2010/1/17 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Magnus Hagander wrote: Which shows one potentially big problem - since we're calling select() from inside libpq, it's not calling our signal emulation layer compatible select(). This means that at this point, walreceiver is not

Re: [HACKERS] Streaming Replication on win32

2010-01-18 Thread Heikki Linnakangas
Magnus Hagander wrote: 2010/1/17 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: We could replace the blocking PQexec() calls with PQsendQuery(), and use the emulated version of select() to wait. Hmm. That would at least theoretically work, but aren't there still places we may end

[HACKERS] Streaming Replication on win32

2010-01-17 Thread Magnus Hagander
I'm trying to figure out why streaming replication doesn't work on win32. Here is what I have so far: It starts up fine, and outputs: LOG: starting archive recovery LOG: standby_mode = 'on' LOG: primary_conninfo = 'host=localhost port=5432' LOG: starting streaming recovery at 0/200 After

Re: [HACKERS] Streaming Replication on win32

2010-01-17 Thread Heikki Linnakangas
Magnus Hagander wrote: Which shows one potentially big problem - since we're calling select() from inside libpq, it's not calling our signal emulation layer compatible select(). This means that at this point, walreceiver is not interruptible. Which also shows itself if I shut down the system -

Re: [HACKERS] Streaming Replication on win32

2010-01-17 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes: Which shows one potentially big problem - since we're calling select() from inside libpq, it's not calling our signal emulation layer compatible select(). This means that at this point, walreceiver is not interruptible. Ugh. Which also shows