Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Tue, Jan 22, 2013 at 7:31 AM, Amit Kapila amit.kap...@huawei.com wrote: On Monday, January 21, 2013 6:22 PM Magnus Hagander On Fri, Jan 18, 2013 at 7:50 AM, Amit Kapila amit.kap...@huawei.com wrote: On Wednesday, January 16, 2013 4:02 PM Heikki Linnakangas wrote: On 07.01.2013 16:23, Boszormenyi Zoltan wrote: Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly, patch rejects 2 hunks. The fixed up patch is attached. Now that I look at this a high-level perspective, why are we only worried about timeouts in the Copy-mode and when connecting? The initial checkpoint could take a long time too, and if the server turns into a black hole while the checkpoint is running, pg_basebackup will still hang. Then again, a short timeout on that phase would be a bad idea, because the checkpoint can indeed take a long time. True, but IMO, if somebody want to take basebackup, he should do that when the server is not loaded. A lot of installations don't have such an optino, because there is no time whe nthe server is not loaded. Good to know about it. I have always heard that customer will run background maintenance activities (Reindex, Vacuum Full, etc) when the server is less loaded. For example a. Billing applications in telecom, at night times they can be relatively less loaded. That assumes there is a nighttime.. If you're operating in enough timezones, that won't happen. b. Any databases used for Sensex transactions, they will be relatively free once the market is closed. c. Banking solutions, because transactions are done mostly in day times. True. But those are definitely very very narrow usecases ;) Don't get me wrong. There are a *lot* of people who have nighttimes to do maintenance in. They are the lucky ones :) But we can't assume this scenario. There will be many cases where Database server will be loaded all the times, if you can give some example, it will be a good learning for me. Most internet based businesses that do business in multiple countries. Or really, any business that has customers in multiple timezones across the world. And even more to the point, any business who's *customers* have customers in multiple timezones across the world, provided they are services-based. In streaming replication, the keep-alive messages carry additional information, the timestamps and WAL locations, so a keepalive makes sense at that level. But otherwise, aren't we just trying to reimplement TCP keepalives? TCP keepalives are not perfect, but if we want to have an application level timeout, it should be implemented in the FE/BE protocol. I don't think we need to do anything specific to pg_basebackup. The user can simply specify TCP keepalive settings in the connection string, like with any libpq program. I think currently user has no way to specify TCP keepalive settings from pg_basebackup, please let me know if there is any such existing way? You can set it through environment variables. As was discussed elsewhere, it would be good to have the ability to do it natively to pg_basebackup as well. Sure, already modifying the existing patch to support connection string in pg_basebackup and pg_receivexlog. Good. I think specifying TCP settings is very cumbersome for most users, that's the reason most standard interfaces (ODBC/JDBC) have such application level timeout mechanism. By implementing in FE/BE protocol (do you mean to say that make such non-blocking behavior inside Libpq or something else), it might be generic and can be used for others as well but it might need few interface changes. If it's specifying them that is cumbersome, then that's the part we should fix, rather than modifying the protocol, no? That can be done as part of point 2 of initial proposal (2. Support recv_timeout separately to provide a way to users who are not comfortable tcp keepalives). Looking at the bigger picture, we should in that case support those on *all* our frontend applications, and not just pg_basebackup. To me, it makes more sense to just say use the connection string method to connect when you need to set these parameters. There are always going to be some parameters that require that. To achieve this there can be 2 ways. 1. Change in FE/BE protocol - I am not sure exactly how this can be done, but as per Heikki this is better way of implementing it. 2. Make the socket as non-blocking in pg_basebackup. Advantage of Approach-1 is that if we do in such a fashion that in lower layers (libpq) it is addressed then all other apps (pg_basebackup, etc) can use it, no need to handle separately in each application. So now as changes in Approach-1 seems to be invasive, we decided to do it later. Ok - I haven't really been following the thread, but that doesn't seem unreasonable. The thing I was objecting to is putting in special parameters to
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Fri, Jan 18, 2013 at 7:50 AM, Amit Kapila amit.kap...@huawei.com wrote: On Wednesday, January 16, 2013 4:02 PM Heikki Linnakangas wrote: On 07.01.2013 16:23, Boszormenyi Zoltan wrote: Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly, patch rejects 2 hunks. The fixed up patch is attached. Now that I look at this a high-level perspective, why are we only worried about timeouts in the Copy-mode and when connecting? The initial checkpoint could take a long time too, and if the server turns into a black hole while the checkpoint is running, pg_basebackup will still hang. Then again, a short timeout on that phase would be a bad idea, because the checkpoint can indeed take a long time. True, but IMO, if somebody want to take basebackup, he should do that when the server is not loaded. A lot of installations don't have such an optino, because there is no time whe nthe server is not loaded. In streaming replication, the keep-alive messages carry additional information, the timestamps and WAL locations, so a keepalive makes sense at that level. But otherwise, aren't we just trying to reimplement TCP keepalives? TCP keepalives are not perfect, but if we want to have an application level timeout, it should be implemented in the FE/BE protocol. I don't think we need to do anything specific to pg_basebackup. The user can simply specify TCP keepalive settings in the connection string, like with any libpq program. I think currently user has no way to specify TCP keepalive settings from pg_basebackup, please let me know if there is any such existing way? You can set it through environment variables. As was discussed elsewhere, it would be good to have the ability to do it natively to pg_basebackup as well. I think specifying TCP settings is very cumbersome for most users, that's the reason most standard interfaces (ODBC/JDBC) have such application level timeout mechanism. By implementing in FE/BE protocol (do you mean to say that make such non-blocking behavior inside Libpq or something else), it might be generic and can be used for others as well but it might need few interface changes. If it's specifying them that is cumbersome, then that's the part we should fix, rather than modifying the protocol, no? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Monday, January 21, 2013 6:22 PM Magnus Hagander On Fri, Jan 18, 2013 at 7:50 AM, Amit Kapila amit.kap...@huawei.com wrote: On Wednesday, January 16, 2013 4:02 PM Heikki Linnakangas wrote: On 07.01.2013 16:23, Boszormenyi Zoltan wrote: Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly, patch rejects 2 hunks. The fixed up patch is attached. Now that I look at this a high-level perspective, why are we only worried about timeouts in the Copy-mode and when connecting? The initial checkpoint could take a long time too, and if the server turns into a black hole while the checkpoint is running, pg_basebackup will still hang. Then again, a short timeout on that phase would be a bad idea, because the checkpoint can indeed take a long time. True, but IMO, if somebody want to take basebackup, he should do that when the server is not loaded. A lot of installations don't have such an optino, because there is no time whe nthe server is not loaded. Good to know about it. I have always heard that customer will run background maintenance activities (Reindex, Vacuum Full, etc) when the server is less loaded. For example a. Billing applications in telecom, at night times they can be relatively less loaded. b. Any databases used for Sensex transactions, they will be relatively free once the market is closed. c. Banking solutions, because transactions are done mostly in day times. There will be many cases where Database server will be loaded all the times, if you can give some example, it will be a good learning for me. In streaming replication, the keep-alive messages carry additional information, the timestamps and WAL locations, so a keepalive makes sense at that level. But otherwise, aren't we just trying to reimplement TCP keepalives? TCP keepalives are not perfect, but if we want to have an application level timeout, it should be implemented in the FE/BE protocol. I don't think we need to do anything specific to pg_basebackup. The user can simply specify TCP keepalive settings in the connection string, like with any libpq program. I think currently user has no way to specify TCP keepalive settings from pg_basebackup, please let me know if there is any such existing way? You can set it through environment variables. As was discussed elsewhere, it would be good to have the ability to do it natively to pg_basebackup as well. Sure, already modifying the existing patch to support connection string in pg_basebackup and pg_receivexlog. I think specifying TCP settings is very cumbersome for most users, that's the reason most standard interfaces (ODBC/JDBC) have such application level timeout mechanism. By implementing in FE/BE protocol (do you mean to say that make such non-blocking behavior inside Libpq or something else), it might be generic and can be used for others as well but it might need few interface changes. If it's specifying them that is cumbersome, then that's the part we should fix, rather than modifying the protocol, no? That can be done as part of point 2 of initial proposal (2. Support recv_timeout separately to provide a way to users who are not comfortable tcp keepalives). To achieve this there can be 2 ways. 1. Change in FE/BE protocol - I am not sure exactly how this can be done, but as per Heikki this is better way of implementing it. 2. Make the socket as non-blocking in pg_basebackup. Advantage of Approach-1 is that if we do in such a fashion that in lower layers (libpq) it is addressed then all other apps (pg_basebackup, etc) can use it, no need to handle separately in each application. So now as changes in Approach-1 seems to be invasive, we decided to do it later. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Wednesday, January 16, 2013 4:02 PM Heikki Linnakangas wrote: On 07.01.2013 16:23, Boszormenyi Zoltan wrote: Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly, patch rejects 2 hunks. The fixed up patch is attached. Now that I look at this a high-level perspective, why are we only worried about timeouts in the Copy-mode and when connecting? The initial checkpoint could take a long time too, and if the server turns into a black hole while the checkpoint is running, pg_basebackup will still hang. Then again, a short timeout on that phase would be a bad idea, because the checkpoint can indeed take a long time. True, but IMO, if somebody want to take basebackup, he should do that when the server is not loaded. In streaming replication, the keep-alive messages carry additional information, the timestamps and WAL locations, so a keepalive makes sense at that level. But otherwise, aren't we just trying to reimplement TCP keepalives? TCP keepalives are not perfect, but if we want to have an application level timeout, it should be implemented in the FE/BE protocol. I don't think we need to do anything specific to pg_basebackup. The user can simply specify TCP keepalive settings in the connection string, like with any libpq program. I think currently user has no way to specify TCP keepalive settings from pg_basebackup, please let me know if there is any such existing way? I think specifying TCP settings is very cumbersome for most users, that's the reason most standard interfaces (ODBC/JDBC) have such application level timeout mechanism. By implementing in FE/BE protocol (do you mean to say that make such non-blocking behavior inside Libpq or something else), it might be generic and can be used for others as well but it might need few interface changes. IMHO if by having such less impact changes for pg_basebackup, it makes pg_basebackup network sensitive, the current approach can also be considered. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On 07.01.2013 16:23, Boszormenyi Zoltan wrote: Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly, patch rejects 2 hunks. The fixed up patch is attached. Now that I look at this a high-level perspective, why are we only worried about timeouts in the Copy-mode and when connecting? The initial checkpoint could take a long time too, and if the server turns into a black hole while the checkpoint is running, pg_basebackup will still hang. Then again, a short timeout on that phase would be a bad idea, because the checkpoint can indeed take a long time. In streaming replication, the keep-alive messages carry additional information, the timestamps and WAL locations, so a keepalive makes sense at that level. But otherwise, aren't we just trying to reimplement TCP keepalives? TCP keepalives are not perfect, but if we want to have an application level timeout, it should be implemented in the FE/BE protocol. I don't think we need to do anything specific to pg_basebackup. The user can simply specify TCP keepalive settings in the connection string, like with any libpq program. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On January 07, 2013 7:53 PM Boszormenyi Zoltan wrote: Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly, patch rejects 2 hunks. The fixed up patch is attached. Patch is verified. Thanks for rebasing the patch. Regards, Hari babu. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
2013-01-04 13:43 keltezéssel, Hari Babu írta: On January 02, 2013 12:41 PM Hari Babu wrote: On January 01, 2013 10:19 PM Boszormenyi Zoltan wrote: I am reviewing your patch. • Is the patch in context diff format? Yes. Thanks for reviewing the patch. • Does it apply cleanly to the current git master? Not quite cleanly but it doesn't produce rejects or fuzz, only offset warnings: Will rebase the patch to head. • Does it include reasonable tests, necessary doc patches, etc? The test cases are not applicable. There is no test framework for testing network outage in make check. There are no documentation patches for the new --recvtimeout=INTERVAL and --conntimeout=INTERVAL options for either pg_basebackup or pg_receivexlog. I will add the documentation for the same. Per the previous comment, no. But those are for the backend to notice network breakdowns and as such, they need a separate patch. I also think it is better to handle it as a separate patch for walsender. • Are the comments sufficient and accurate? This chunk below removes a comment which seems obvious enough so it's not needed: *** *** 518,524 ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, goto error; } ! /* Check the message type. */ if (copybuf[0] == 'k') { int pos; --- 559,568 goto error; } ! /* Set the last reply timestamp */ ! last_recv_timestamp = localGetCurrentTimestamp(); ! ping_sent = false; ! if (copybuf[0] == 'k') { int pos; *** Other comments are sufficient and accurate. I will fix and update the patch. The attached V2 patch in the mail handles all the review comments identified above. Regards, Hari babu. Since my other patch against pg_basebackup is now committed, this patch doesn't apply cleanly, patch rejects 2 hunks. The fixed up patch is attached. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ diff -dcrpN postgresql.orig/doc/src/sgml/ref/pg_basebackup.sgml postgresql/doc/src/sgml/ref/pg_basebackup.sgml *** postgresql.orig/doc/src/sgml/ref/pg_basebackup.sgml 2013-01-05 17:34:30.742135371 +0100 --- postgresql/doc/src/sgml/ref/pg_basebackup.sgml 2013-01-07 15:11:40.787007890 +0100 *** PostgreSQL documentation *** 400,405 --- 400,425 /varlistentry varlistentry + termoption-r replaceable class=parameterinterval/replaceable/option/term + termoption--recvtimeout=replaceable class=parameterinterval/replaceable/option/term + listitem +para + time that receiver waits for communication from server (in seconds). +/para + /listitem + /varlistentry + + varlistentry + termoption-t replaceable class=parameterinterval/replaceable/option/term + termoption--conntimeout=replaceable class=parameterinterval/replaceable/option/term + listitem +para + time that client wait for connection to establish with server (in seconds). +/para + /listitem + /varlistentry + + varlistentry termoption-U replaceableusername/replaceable/option/term termoption--username=replaceable class=parameterusername/replaceable/option/term listitem diff -dcrpN postgresql.orig/doc/src/sgml/ref/pg_receivexlog.sgml postgresql/doc/src/sgml/ref/pg_receivexlog.sgml *** postgresql.orig/doc/src/sgml/ref/pg_receivexlog.sgml 2012-11-08 13:13:04.152630639 +0100 --- postgresql/doc/src/sgml/ref/pg_receivexlog.sgml 2013-01-07 15:11:40.788007898 +0100 *** PostgreSQL documentation *** 164,169 --- 164,189 /varlistentry varlistentry + termoption-r replaceable class=parameterinterval/replaceable/option/term + termoption--recvtimeout=replaceable class=parameterinterval/replaceable/option/term + listitem +para + time that receiver waits for communication from server (in seconds). +/para + /listitem + /varlistentry + + varlistentry + termoption-t replaceable class=parameterinterval/replaceable/option/term + termoption--conntimeout=replaceable class=parameterinterval/replaceable/option/term + listitem +para + time that client wait for connection to establish with server (in seconds). +/para + /listitem + /varlistentry + + varlistentry termoption-U replaceableusername/replaceable/option/term termoption--username=replaceable class=parameterusername/replaceable/option/term
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On January 02, 2013 12:41 PM Hari Babu wrote: On January 01, 2013 10:19 PM Boszormenyi Zoltan wrote: I am reviewing your patch. Is the patch in context diff format? Yes. Thanks for reviewing the patch. Does it apply cleanly to the current git master? Not quite cleanly but it doesn't produce rejects or fuzz, only offset warnings: Will rebase the patch to head. Does it include reasonable tests, necessary doc patches, etc? The test cases are not applicable. There is no test framework for testing network outage in make check. There are no documentation patches for the new --recvtimeout=INTERVAL and --conntimeout=INTERVAL options for either pg_basebackup or pg_receivexlog. I will add the documentation for the same. Per the previous comment, no. But those are for the backend to notice network breakdowns and as such, they need a separate patch. I also think it is better to handle it as a separate patch for walsender. Are the comments sufficient and accurate? This chunk below removes a comment which seems obvious enough so it's not needed: *** *** 518,524 ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, goto error; } ! /* Check the message type. */ if (copybuf[0] == 'k') { int pos; --- 559,568 goto error; } ! /* Set the last reply timestamp */ ! last_recv_timestamp = localGetCurrentTimestamp(); ! ping_sent = false; ! if (copybuf[0] == 'k') { int pos; *** Other comments are sufficient and accurate. I will fix and update the patch. The attached V2 patch in the mail handles all the review comments identified above. Regards, Hari babu. pg_basebkup_recvxlog_noblock_comm_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Review of pg_basebackup and pg_receivexlog to use non-blocking socket communication, was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On January 01, 2013 10:19 PM Boszormenyi Zoltan wrote: I am reviewing your patch. Is the patch in context diff format? Yes. Thanks for reviewing the patch. Does it apply cleanly to the current git master? Not quite cleanly but it doesn't produce rejects or fuzz, only offset warnings: Will rebase the patch to head. Does it include reasonable tests, necessary doc patches, etc? The test cases are not applicable. There is no test framework for testing network outage in make check. There are no documentation patches for the new --recvtimeout=INTERVAL and --conntimeout=INTERVAL options for either pg_basebackup or pg_receivexlog. I will add the documentation for the same. Per the previous comment, no. But those are for the backend to notice network breakdowns and as such, they need a separate patch. I also think it is better to handle it as a separate patch for walsender. Are the comments sufficient and accurate? This chunk below removes a comment which seems obvious enough so it's not needed: *** *** 518,524 ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, goto error; } ! /* Check the message type. */ if (copybuf[0] == 'k') { int pos; --- 559,568 goto error; } ! /* Set the last reply timestamp */ ! last_recv_timestamp = localGetCurrentTimestamp(); ! ping_sent = false; ! if (copybuf[0] == 'k') { int pos; *** Other comments are sufficient and accurate. I will fix and update the patch. Please let me know if anything apart from above needs to be taken care. Regards, Hari babu. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers