Re: [Pgpool-general] Order of Query Results
On Thursday, August 05, 2010 06:26:46 pm Tatsuo Ishii wrote:

Seems like a pgpool issue. What I am wondering about is this line:

2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t

This suggests that your DB application uses a very old PostgreSQL protocol (called the version 2 protocol), implemented in PostgreSQL 7.3 or before. But you said your PostgreSQL is 8.4.4. Is there anything special about your DB application?

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Is there anything special about your DB application?

Ya, to put it nicely, it's old as dirt thanks to about a decade of bad decisions by people who shouldn't be allowed to decide what to have for lunch. Sorry, I forgot to add that the primary user of that table is a custom ordering system that currently runs Python 1.5 using _pg that was most likely built against PostgreSQL before 7.2. I can try to see if we can move it to using something built against 8.3 (or 8.4 if I can convince it to install on RHEL 2.1) though, maybe in a week or two. Is it possible that that is confusing pgpool?

I think it's possible. The version 2 protocol is hard to manage and fragile, and the code for the protocol in pgpool-II is complex. Using 8.3 or 8.4 (any version 7.4 or later) will make pgpool-II more robust and efficient.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Alright, we'll move to psycopg built against 8.3 as soon as we can and get back to the list with results.

___
Pgpool-general mailing list
Pgpool-general@pgfoundry.org
http://pgfoundry.org/mailman/listinfo/pgpool-general
Re: [Pgpool-general] Order of Query Results
On Wednesday, August 04, 2010 07:43:40 pm Tatsuo Ishii wrote:

OK, I have some more information. The cluster fell apart again yesterday. Again, there was nothing out of the ordinary in the PostgreSQL logs leading up to the nodes falling out, but this was logged in the pgpool log.

First node fell out at 10:36am:

2010-08-02 10:36:47 DEBUG: pid 16699: AsciiRow: 24 th field size does not match between master(16777216) and 2 th backend(0)
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 0 th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 1 th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 2 th backend D NUM_BACKENDS: 3
2010-08-02 10:36:47 ERROR: pid 16699: read_kind_from_backend: 2 th kind D does not match with master or majority connection kind ^@
2010-08-02 10:36:47 ERROR: pid 16699: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (12979,12984,12987,12986,12982,12981) kind details are: 2[D]
2010-08-02 10:36:47 LOG: pid 16699: notice_backend_error: 2 fail over request from pid 16699
2010-08-02 10:36:47 DEBUG: pid 24933: failover_handler called

Second node fell out at 1:56pm:

2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t
2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind from 0 th backend ^@ NUM_BACKENDS: 3
2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind from 1 th backend D NUM_BACKENDS: 3
2010-08-02 13:56:21 ERROR: pid 19720: read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-08-02 13:56:21 ERROR: pid 19720: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (13019,13018) kind details are: 1[D]
2010-08-02 13:56:21 LOG: pid 19720: notice_backend_error: 1 fail over request from pid 19720
2010-08-02 13:56:21 DEBUG: pid 24933: failover_handler called

I notice two things. First, we died both times on a query of the carts table, but this could just be a coincidence. Second, we have the ^@ showing up as a kind again, except that while two nodes returned ^@ when the first node fell out, only one returned ^@ when the second fell out, even though it was one of the nodes that returned ^@ that morning.

I have PgPool 2.3.2 and PostgreSQL 8.4.4 on CentOS 5.5. Is this really a PgPool issue, or should I walk over to the PostgreSQL mailing lists?

Seems like a pgpool issue. What I am wondering about is this line:

2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t

This suggests that your DB application uses a very old PostgreSQL protocol (called the version 2 protocol), implemented in PostgreSQL 7.3 or before. But you said your PostgreSQL is 8.4.4. Is there anything special about your DB application?

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Is there anything special about your DB application?

Ya, to put it nicely, it's old as dirt thanks to about a decade of bad decisions by people who shouldn't be allowed to decide what to have for lunch. Sorry, I forgot to add that the primary user of that table is a custom ordering system that currently runs Python 1.5 using _pg that was most likely built against PostgreSQL before 7.2. I can try to see if we can move it to using something built against 8.3 (or 8.4 if I can convince it to install on RHEL 2.1) though, maybe in a week or two. Is it possible that that is confusing pgpool?
Re: [Pgpool-general] Order of Query Results
Seems like a pgpool issue. What I am wondering about is this line:

2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t

This suggests that your DB application uses a very old PostgreSQL protocol (called the version 2 protocol), implemented in PostgreSQL 7.3 or before. But you said your PostgreSQL is 8.4.4. Is there anything special about your DB application?

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Is there anything special about your DB application?

Ya, to put it nicely, it's old as dirt thanks to about a decade of bad decisions by people who shouldn't be allowed to decide what to have for lunch. Sorry, I forgot to add that the primary user of that table is a custom ordering system that currently runs Python 1.5 using _pg that was most likely built against PostgreSQL before 7.2. I can try to see if we can move it to using something built against 8.3 (or 8.4 if I can convince it to install on RHEL 2.1) though, maybe in a week or two. Is it possible that that is confusing pgpool?

I think it's possible. The version 2 protocol is hard to manage and fragile, and the code for the protocol in pgpool-II is complex. Using 8.3 or 8.4 (any version 7.4 or later) will make pgpool-II more robust and efficient.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
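A side note for anyone who wants to confirm which protocol version a client actually speaks: in both the version 2 and the version 3 protocol, the startup packet begins with a 4-byte length followed by a 4-byte protocol version number, with the major version in the high 16 bits and the minor version in the low 16 bits. The following is only a rough Python sketch under that assumption, not code taken from pgpool-II or libpq. The function names and the sample user/database values ("app", "orders") are made up for illustration.

```python
import struct


def startup_protocol_version(packet: bytes):
    """Return (major, minor) read from the first 8 bytes of a startup packet."""
    if len(packet) < 8:
        raise ValueError("startup packet too short")
    # Network byte order: Int32 length, then Int32 protocol version.
    _length, version = struct.unpack("!II", packet[:8])
    return version >> 16, version & 0xFFFF


def build_v3_startup(user: str, database: str) -> bytes:
    """Build a version 3.0 StartupMessage (hypothetical helper for testing)."""
    params = (
        b"user\x00" + user.encode() + b"\x00"
        + b"database\x00" + database.encode() + b"\x00"
        + b"\x00"  # terminator after the last name/value pair
    )
    body = struct.pack("!I", 3 << 16) + params
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!I", len(body) + 4) + body


# A version 2.0 startup packet is a fixed 296 bytes; only the first
# 8 bytes matter here, the rest is zero padding for the sketch.
v2_packet = struct.pack("!II", 296, 2 << 16) + b"\x00" * 288

print(startup_protocol_version(v2_packet))                        # (2, 0)
print(startup_protocol_version(build_v3_startup("app", "orders")))  # (3, 0)
```

Running something like this against the first bytes captured from the old ordering system's connection would show (2, 0) if it really is a version 2 client.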
Re: [Pgpool-general] Order of Query Results
OK, I have some more information. The cluster fell apart again yesterday. Again, there was nothing out of the ordinary in the PostgreSQL logs leading up to the nodes falling out, but this was logged in the pgpool log.

First node fell out at 10:36am:

2010-08-02 10:36:47 DEBUG: pid 16699: AsciiRow: 24 th field size does not match between master(16777216) and 2 th backend(0)
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 0 th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 1 th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 2 th backend D NUM_BACKENDS: 3
2010-08-02 10:36:47 ERROR: pid 16699: read_kind_from_backend: 2 th kind D does not match with master or majority connection kind ^@
2010-08-02 10:36:47 ERROR: pid 16699: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (12979,12984,12987,12986,12982,12981) kind details are: 2[D]
2010-08-02 10:36:47 LOG: pid 16699: notice_backend_error: 2 fail over request from pid 16699
2010-08-02 10:36:47 DEBUG: pid 24933: failover_handler called

Second node fell out at 1:56pm:

2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t
2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind from 0 th backend ^@ NUM_BACKENDS: 3
2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind from 1 th backend D NUM_BACKENDS: 3
2010-08-02 13:56:21 ERROR: pid 19720: read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-08-02 13:56:21 ERROR: pid 19720: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (13019,13018) kind details are: 1[D]
2010-08-02 13:56:21 LOG: pid 19720: notice_backend_error: 1 fail over request from pid 19720
2010-08-02 13:56:21 DEBUG: pid 24933: failover_handler called

I notice two things. First, we died both times on a query of the carts table, but this could just be a coincidence. Second, we have the ^@ showing up as a kind again, except that while two nodes returned ^@ when the first node fell out, only one returned ^@ when the second fell out, even though it was one of the nodes that returned ^@ that morning.

I have PgPool 2.3.2 and PostgreSQL 8.4.4 on CentOS 5.5. Is this really a PgPool issue, or should I walk over to the PostgreSQL mailing lists?

Seems like a pgpool issue. What I am wondering about is this line:

2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t

This suggests that your DB application uses a very old PostgreSQL protocol (called the version 2 protocol), implemented in PostgreSQL 7.3 or before. But you said your PostgreSQL is 8.4.4. Is there anything special about your DB application?

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
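When there are many of these incidents to go through, it may help to pull the per-backend kinds out of the debug log mechanically. Below is a small Python sketch that does this for the read_kind_from_backend lines quoted above. It assumes the log line layout is exactly as shown in this thread and that the NUL kind appears as the two characters ^@ (in a raw log file it may be a real 0x00 byte, which the pattern also accepts). The function names are hypothetical and this is not part of pgpool.

```python
import re

# Matches lines such as:
#   ... pid 16699: read_kind_from_backend: read kind from 2 th backend D NUM_BACKENDS: 3
KIND_LINE = re.compile(
    r"pid (?P<pid>\d+): read_kind_from_backend: read kind from "
    r"(?P<backend>\d+) th backend (?P<kind>\S+) NUM_BACKENDS: (?P<total>\d+)"
)


def kinds_by_pid(log_text):
    """Map each pgpool child pid to {backend index: kind string}."""
    result = {}
    for match in KIND_LINE.finditer(log_text):
        per_backend = result.setdefault(int(match["pid"]), {})
        per_backend[int(match["backend"])] = match["kind"]
    return result


def disagreeing_with_master(kinds):
    """Return backend indexes whose kind differs from backend 0 (the master)."""
    master_kind = kinds.get(0)
    return sorted(b for b, kind in kinds.items() if kind != master_kind)


SAMPLE = """\
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 0 th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 1 th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 2 th backend D NUM_BACKENDS: 3
2010-08-02 10:36:47 ERROR: pid 16699: read_kind_from_backend: 2 th kind D does not match with master or majority connection kind ^@
"""

for pid, kinds in kinds_by_pid(SAMPLE).items():
    print(pid, kinds, disagreeing_with_master(kinds))
```

For the 10:36 incident this reports backend 2 as the odd one out, which matches the "kind details are: 2[D]" line in the log.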
Re: [Pgpool-general] Order of Query Results
On Wednesday, July 28, 2010 07:08:42 pm you wrote:
On 28/07/2010 17:16, Sean Brown wrote:
On Wednesday, July 28, 2010 10:56:27 am you wrote:
On 28/07/2010 16:50, Sean Brown wrote:

Does PgPool have an issue with the order of results from a query being returned in a different order? With the ongoing issue of our cluster falling out, we just had one member fall out again and, again, no error is recorded in PostgreSQL's log. PgPool reports the possible last query was:

select * from carts where cart_id in (11835,11824,11819)

Specifically the error in PgPool's log is:

read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-07-28 10:21:23 ERROR: pid 28242: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (11835,11824,11819) kind details are: 1[D]

If I run that query on the remaining node and on the one that just fell out, I get the same 3 results, but the order of the records is different.

The query you show doesn't ask for a specific order (no ORDER BY clause), so each backend can send the data in whatever order it prefers.

That I understand. What I am wondering is whether this will cause PgPool to believe there is an error in the information returned from the backends, even if the only difference is the order. If it does, I assume the best way to deal with it is to add ORDER BY clauses to every query passed to pgpool?

Those are actually really good questions, and I don't have an answer to them. I'm CCing the list again to see whether other people have an answer for us.

No, pgpool does not care about the order of data returned by SELECT. In more detail, PostgreSQL returns data packets like this:

T (description of the tuple)
D (the actual raw data of one tuple)
D
D
C (indicates that the data was sent successfully)

where each single capital letter is a packet kind (see the PostgreSQL docs for more details). What pgpool actually checks is the packet kind, not the content of the tuple data. So as long as the same number of tuples is returned from each backend, pgpool is happy.

I'm not sure which pgpool version Sean uses; it seems to be a little old (judging from the error message). Also, "majority connection kind ^@" looks strange. ^@ = 0x00, which is not a valid kind at all. It seems something unusual is going on... If a self-contained test case is provided, I will be able to look into this.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

OK, I have some more information. The cluster fell apart again yesterday. Again, there was nothing out of the ordinary in the PostgreSQL logs leading up to the nodes falling out, but this was logged in the pgpool log.

First node fell out at 10:36am:

2010-08-02 10:36:47 DEBUG: pid 16699: AsciiRow: 24 th field size does not match between master(16777216) and 2 th backend(0)
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 0 th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 1 th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 2 th backend D NUM_BACKENDS: 3
2010-08-02 10:36:47 ERROR: pid 16699: read_kind_from_backend: 2 th kind D does not match with master or majority connection kind ^@
2010-08-02 10:36:47 ERROR: pid 16699: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (12979,12984,12987,12986,12982,12981) kind details are: 2[D]
2010-08-02 10:36:47 LOG: pid 16699: notice_backend_error: 2 fail over request from pid 16699
2010-08-02 10:36:47 DEBUG: pid 24933: failover_handler called

Second node fell out at 1:56pm:

2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t
2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind from 0 th backend ^@ NUM_BACKENDS: 3
2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind from 1 th backend D NUM_BACKENDS: 3
2010-08-02 13:56:21 ERROR: pid 19720: read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-08-02 13:56:21 ERROR: pid 19720: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (13019,13018) kind details are: 1[D]
2010-08-02 13:56:21 LOG: pid 19720: notice_backend_error: 1 fail over request from pid 19720
2010-08-02 13:56:21 DEBUG: pid 24933: failover_handler called

I notice two things. First, we died both times on a query of the carts table, but this could just be a coincidence. Second, we have the ^@ showing up as a kind again, except that while two nodes returned ^@ when the first node fell out, only one returned ^@ when the second fell out, even though it was one of the nodes that returned ^@ that morning.

I have PgPool 2.3.2 and PostgreSQL 8.4.4
Re: [Pgpool-general] Order of Query Results
-Original Message-
From: Tatsuo Ishii [mailto:is...@sraoss.co.jp]
Sent: Wed 7/28/2010 7:08 PM
To: guilla...@lelarge.info
Cc: Sean Brown; pgpool-general@pgfoundry.org
Subject: Re: [Pgpool-general] Order of Query Results

On 28/07/2010 17:16, Sean Brown wrote:
On Wednesday, July 28, 2010 10:56:27 am you wrote:
On 28/07/2010 16:50, Sean Brown wrote:

Does PgPool have an issue with the order of results from a query being returned in a different order? With the ongoing issue of our cluster falling out, we just had one member fall out again and, again, no error is recorded in PostgreSQL's log. PgPool reports the possible last query was:

select * from carts where cart_id in (11835,11824,11819)

Specifically the error in PgPool's log is:

read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-07-28 10:21:23 ERROR: pid 28242: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (11835,11824,11819) kind details are: 1[D]

If I run that query on the remaining node and on the one that just fell out, I get the same 3 results, but the order of the records is different.

The query you show doesn't ask for a specific order (no ORDER BY clause), so each backend can send the data in whatever order it prefers.

That I understand. What I am wondering is whether this will cause PgPool to believe there is an error in the information returned from the backends, even if the only difference is the order. If it does, I assume the best way to deal with it is to add ORDER BY clauses to every query passed to pgpool?

Those are actually really good questions, and I don't have an answer to them. I'm CCing the list again to see whether other people have an answer for us.

No, pgpool does not care about the order of data returned by SELECT. In more detail, PostgreSQL returns data packets like this:

T (description of the tuple)
D (the actual raw data of one tuple)
D
D
:
:
C (indicates that the data was sent successfully)

where each single capital letter is a packet kind (see the PostgreSQL docs for more details). What pgpool actually checks is the packet kind, not the content of the tuple data. So as long as the same number of tuples is returned from each backend, pgpool is happy.

I'm not sure which pgpool version Sean uses; it seems to be a little old (judging from the error message). Also, "majority connection kind ^@" looks strange. ^@ = 0x00, which is not a valid kind at all. It seems something unusual is going on... If a self-contained test case is provided, I will be able to look into this.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Host is CentOS 5.5, PostgreSQL 8.4.4 and PgPool is 2.3.3.

And just in case it matters, the ordering system we use, which is the primary user of our database, is using a very old Python (1.5) and the old pg connector.

The test case is sort of an issue. We haven't been able to actually narrow down any one thing that causes our problems. Usually, there are no errors printed anywhere around the time when a node falls out. I had changed child_life_time from 120 to 0, wondering if PgPool was killing a child that had a persistent connection open (it seems the FedEx shipping tool is kind of stupid in that way), and pgpool held it together for a few days longer than usual, but eventually it still fell apart.

Maybe the next time it dies (probably tomorrow) I'll have more log information.
Re: [Pgpool-general] Order of Query Results
On 28/07/2010 16:50, Sean Brown wrote:

Does PgPool have an issue with the order of results from a query being returned in a different order? With the ongoing issue of our cluster falling out, we just had one member fall out again and, again, no error is recorded in PostgreSQL's log. PgPool reports the possible last query was:

select * from carts where cart_id in (11835,11824,11819)

Specifically the error in PgPool's log is:

read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-07-28 10:21:23 ERROR: pid 28242: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (11835,11824,11819) kind details are: 1[D]

If I run that query on the remaining node and on the one that just fell out, I get the same 3 results, but the order of the records is different.

The query you show doesn't ask for a specific order (no ORDER BY clause), so each backend can send the data in whatever order it prefers.

--
Guillaume
http://www.postgresql.fr
http://dalibo.com
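To make the point about ordering concrete: without an ORDER BY, comparing the results from two nodes row by row in arrival order can report differences that are not real. The Python sketch below shows an order-insensitive comparison, with each row represented as a tuple. The sample rows reuse the cart ids from the query above and are otherwise made up, and same_rows_any_order is a hypothetical helper name.

```python
from collections import Counter


def same_rows_any_order(rows_a, rows_b):
    """True if both result sets hold the same rows, ignoring row order.

    Counter is used instead of set so that duplicate rows are still counted.
    """
    return Counter(rows_a) == Counter(rows_b)


# The same three carts, as two nodes might return them without ORDER BY.
node0_rows = [(11835,), (11824,), (11819,)]
node1_rows = [(11819,), (11835,), (11824,)]

print(node0_rows == node1_rows)                     # False
print(same_rows_any_order(node0_rows, node1_rows))  # True
```

If a stable order matters to the application itself, adding an ORDER BY (for example on cart_id) to the query is the usual fix on the SQL side.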
Re: [Pgpool-general] Order of Query Results
On 28/07/2010 17:16, Sean Brown wrote:
On Wednesday, July 28, 2010 10:56:27 am you wrote:
On 28/07/2010 16:50, Sean Brown wrote:

Does PgPool have an issue with the order of results from a query being returned in a different order? With the ongoing issue of our cluster falling out, we just had one member fall out again and, again, no error is recorded in PostgreSQL's log. PgPool reports the possible last query was:

select * from carts where cart_id in (11835,11824,11819)

Specifically the error in PgPool's log is:

read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-07-28 10:21:23 ERROR: pid 28242: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (11835,11824,11819) kind details are: 1[D]

If I run that query on the remaining node and on the one that just fell out, I get the same 3 results, but the order of the records is different.

The query you show doesn't ask for a specific order (no ORDER BY clause), so each backend can send the data in whatever order it prefers.

That I understand. What I am wondering is whether this will cause PgPool to believe there is an error in the information returned from the backends, even if the only difference is the order. If it does, I assume the best way to deal with it is to add ORDER BY clauses to every query passed to pgpool?

Those are actually really good questions, and I don't have an answer to them. I'm CCing the list again to see whether other people have an answer for us.

--
Guillaume
http://www.postgresql.fr
http://dalibo.com
Re: [Pgpool-general] Order of Query Results
On Wednesday, July 28, 2010 10:56:27 am Guillaume Lelarge wrote:
On 28/07/2010 16:50, Sean Brown wrote:

Does PgPool have an issue with the order of results from a query being returned in a different order? With the ongoing issue of our cluster falling out, we just had one member fall out again and, again, no error is recorded in PostgreSQL's log. PgPool reports the possible last query was:

select * from carts where cart_id in (11835,11824,11819)

Specifically the error in PgPool's log is:

read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-07-28 10:21:23 ERROR: pid 28242: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (11835,11824,11819) kind details are: 1[D]

If I run that query on the remaining node and on the one that just fell out, I get the same 3 results, but the order of the records is different.

The query you show doesn't ask for a specific order (no ORDER BY clause), so each backend can send the data in whatever order it prefers.

Making sure to respond to the mailing list this time.

That I understand. What I am wondering is whether this will cause PgPool to believe there is an error in the information returned from the backends, even if the only difference is the order. If it does, I assume the best way to deal with it is to add ORDER BY clauses to every query passed to pgpool?
Re: [Pgpool-general] Order of Query Results
On 28/07/2010 17:16, Sean Brown wrote:
On Wednesday, July 28, 2010 10:56:27 am you wrote:
On 28/07/2010 16:50, Sean Brown wrote:

Does PgPool have an issue with the order of results from a query being returned in a different order? With the ongoing issue of our cluster falling out, we just had one member fall out again and, again, no error is recorded in PostgreSQL's log. PgPool reports the possible last query was:

select * from carts where cart_id in (11835,11824,11819)

Specifically the error in PgPool's log is:

read_kind_from_backend: 1 th kind D does not match with master or majority connection kind ^@
2010-07-28 10:21:23 ERROR: pid 28242: kind mismatch among backends. Possible last query was: select * from carts where cart_id in (11835,11824,11819) kind details are: 1[D]

If I run that query on the remaining node and on the one that just fell out, I get the same 3 results, but the order of the records is different.

The query you show doesn't ask for a specific order (no ORDER BY clause), so each backend can send the data in whatever order it prefers.

That I understand. What I am wondering is whether this will cause PgPool to believe there is an error in the information returned from the backends, even if the only difference is the order. If it does, I assume the best way to deal with it is to add ORDER BY clauses to every query passed to pgpool?

Those are actually really good questions, and I don't have an answer to them. I'm CCing the list again to see whether other people have an answer for us.

No, pgpool does not care about the order of data returned by SELECT. In more detail, PostgreSQL returns data packets like this:

T (description of the tuple)
D (the actual raw data of one tuple)
D
D
:
:
C (indicates that the data was sent successfully)

where each single capital letter is a packet kind (see the PostgreSQL docs for more details). What pgpool actually checks is the packet kind, not the content of the tuple data. So as long as the same number of tuples is returned from each backend, pgpool is happy.

I'm not sure which pgpool version Sean uses; it seems to be a little old (judging from the error message). Also, "majority connection kind ^@" looks strange. ^@ = 0x00, which is not a valid kind at all. It seems something unusual is going on... If a self-contained test case is provided, I will be able to look into this.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
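The kind check described above can be pictured with a few lines of Python. This is only an illustration of the idea of comparing one packet kind per backend and flagging the backend that disagrees. It is not pgpool's actual code: the tie-breaking rule (prefer the kind seen first, that is, the master's) is an assumption, and find_kind_mismatches is a hypothetical name.

```python
from collections import Counter


def find_kind_mismatches(kinds):
    """Return indexes of backends whose packet kind differs from the majority.

    `kinds` holds one single-byte packet kind per backend, in backend order.
    Only the kind byte is compared, never the row contents, so a different
    row order alone cannot produce a mismatch. On a tie, the kind that
    appears first in the list (backend 0, the master) wins; that rule is
    an assumption made for this sketch.
    """
    counts = Counter(kinds)
    best = max(counts.values())
    majority_kind = next(kind for kind in kinds if counts[kind] == best)
    return [i for i, kind in enumerate(kinds) if kind != majority_kind]


# Three healthy backends answering a SELECT that returns two rows:
# every step of T, D, D, C carries the same kind on all backends.
healthy = [[b"T", b"D", b"D", b"C"] for _ in range(3)]
for step in zip(*healthy):
    assert find_kind_mismatches(list(step)) == []

# The reported situation: two backends yield 0x00 and one yields D.
print(find_kind_mismatches([b"\x00", b"\x00", b"D"]))  # [2]
```

Note that in the logged failures the backend that gets dropped is the one reporting D, which is a valid kind, while the "majority" kind is the invalid 0x00. That fits the observation that something unusual is going on in how the stream is being read.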