Hi Marcelo,

I ran a 3-day continuous stress test using pgbench before RC1 or RC2 and
nothing went wrong. So the possibilities are:

1) I need a higher load average.
2) I need more complex queries than just SELECT/UPDATE/INSERT.

Thoughts?
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Hi James,
>
> Unfortunately I have also had several issues with backends getting
> mismatch problems on 2.2 while under load. Had to go back to 2.1.
>
> I was doing a stress test on our pgpool server, which has two backends.
> One of our devs created a Python script that replays the Apache logs,
> and the script is run on 4 boxes that open 150 concurrent connections
> to pgpool.
>
> When the script first starts everything seems OK. But when the number
> of transactions starts to increase, the load on the pg backends gets
> high, around 20-30 avg, and that's when pgpool starts to throw a
> bunch of mismatch errors and the backends fall out of sync. Sometimes
> that happens in 5 minutes, other times in 10-15 minutes.
>
> So we decided to do the stress test again, but against version 2.1 with a
> patch for the DECLARE statements. It was a CVS version from 2008-08-25,
> if I'm correct. Everything worked out great on version 2.1; we
> stress-tested pgpool and the backends to the fullest with no problems at
> all. We let the test run for 1 hour and repeated it about 3 times. The
> load avg on the pg backends also reached around 50-70.
>
> When I get some time, hopefully next week or so, I will start doing
> these same tests while increasing the pgpool CVS version after revision
> 112 until I start seeing problems again. Hopefully that will help
> Tatsuo.
>
>
> -
> Marcelo
>
> On Mar 6, 2009, at 2:44, Jaume Sabater <[email protected]> wrote:
>
> > Hi all!
> >
> > Just tried to connect to my pgpool-II 2.2/PostgreSQL 8.3 cluster and
> > saw an error, which I forgot to copy and paste somewhere, that said
> > something like "error in catalog with relid 26243" (I only copied the
> > number).
> > I checked the cluster and, again, there had been a kind
> > mismatch among backends, so the slave node was down and the cluster
> > was working only with the master node.
> >
> > This is what I found in the syslog:
> >
> > Mar 6 08:10:27 pgsql1 pgpool: ERROR: pid 26306:
> > read_kind_from_backend: 1 th kind E does not match with master or
> > majority connection kind C
> > Mar 6 08:10:27 pgsql1 pgpool: ERROR: pid 26306: kind mismatch among
> > backends. Possible last query was: "COPY "TSearcherServices"
> > ("IdSearcherServices", "IdSearcher" ,"IdService", "SearcherNumber" )
> > Mar 6 08:10:27 pgsql1 pgpool: FROM '/opt/pgpool2/
> > TSearcherServices.csv'
> > Mar 6 08:10:27 pgsql1 pgpool: WITH DELIMITER AS '|' CSV;" kind
> > details are: 0[C] 1[E]
> > Mar 6 08:10:27 pgsql1 pgpool: LOG: pid 26306: notice_backend_error:
> > 1 fail over request from pid 26306
> > Mar 6 08:10:27 pgsql1 pgpool: LOG: pid 5315: starting degeneration.
> > shutdown host pgsql2.freyatest.domain(5432)
> > Mar 6 08:10:27 pgsql1 pgpool: LOG: pid 5315: execute command:
> > /var/lib/postgresql/8.3/main/pgpool-failover 1 pgsql2.freyatest.domain
> > 5432 /var/lib/postgresql/8.3/main 0 0
> > Mar 6 08:10:27 pgsql1 pgpool[32211]: Executing pgpool-failover as
> > user postgres
> > Mar 6 08:10:27 pgsql1 pgpool[32212]: Failover of node 1 at hostname
> > pgsql2.freyatest.domain. New master node is 0. Old master node was 0.
> > Mar 6 08:10:27 pgsql1 pgpool: LOG: pid 5315: failover_handler: set
> > new master node: 0
> > Mar 6 08:10:27 pgsql1 pgpool: LOG: pid 5315: failover done.
> > shutdown host pgsql2.freyatest.domain(5432)
> >
> >
> > These COPY operations have been very frequent during the last three or
> > four months, with developers constantly dumping information here and
> > there. With version 2.1 I never had a mismatch among backends, but now
> > I have had 2 of those this very same week, plus a few more in the
> > previous couple of weeks (we were working with betas or RCs of version
> > 2.2).
> > I can't really pin the issue on version 2.2, but I
> > promise I don't recall it happening with version 2.1. It is true that
> > the number of operations on the PostgreSQL cluster has increased a
> > lot in the last 4 weeks, too.
> >
> > Tatsuo, could you please check it out? Here is the other error
> > that happened this week. Notice the query was different. Logs from
> > the past week are gone, unfortunately.
> >
> > Mar 5 14:50:26 pgsql1 pgpool: ERROR: pid 20120: pool_read: read
> > failed (Connection reset by peer)
> > Mar 5 14:50:26 pgsql1 pgpool: LOG: pid 20120:
> > ProcessFrontendResponse: failed to read kind from frontend. frontend
> > abnormally exited
> > Mar 5 14:50:26 pgsql1 pgpool: LOG: pid 20120:
> > read_kind_from_backend: parameter name: is_superuser value: on
> > Mar 5 14:50:26 pgsql1 pgpool: LOG: pid 20120:
> > read_kind_from_backend: parameter name: session_authorization value:
> > pgpool2
> > Mar 5 14:50:26 pgsql1 pgpool: LOG: pid 20120:
> > read_kind_from_backend: parameter name: is_superuser value: on
> > Mar 5 14:50:26 pgsql1 pgpool: LOG: pid 20120:
> > read_kind_from_backend: parameter name: session_authorization value:
> > pgpool2
> > Mar 5 14:50:57 pgsql1 pgpool: LOG: pid 19950:
> > read_kind_from_backend: parameter name: is_superuser value: on
> > Mar 5 14:50:57 pgsql1 pgpool: LOG: pid 19950:
> > read_kind_from_backend: parameter name: session_authorization value:
> > pgpool2
> > Mar 5 14:50:57 pgsql1 pgpool: LOG: pid 19950:
> > read_kind_from_backend: parameter name: is_superuser value: on
> > Mar 5 14:50:57 pgsql1 pgpool: LOG: pid 19950:
> > read_kind_from_backend: parameter name: session_authorization value:
> > pgpool2
> > Mar 5 14:51:08 pgsql1 pgpool: ERROR: pid 19538:
> > read_kind_from_backend: 1 th kind E does not match with master or
> > majority connection kind C
> > Mar 5 14:51:08 pgsql1 pgpool: ERROR: pid 19538: kind mismatch among
> > backends.
> > Possible last query was: "delete from "TSearcher"" kind
> > details are: 0[C] 1[E]
> > Mar 5 14:51:08 pgsql1 pgpool: LOG: pid 19538: notice_backend_error:
> > 1 fail over request from pid 19538
> > Mar 5 14:51:08 pgsql1 pgpool: LOG: pid 5315: starting degeneration.
> > shutdown host pgsql2.freyatest.domain(5432)
> > Mar 5 14:51:08 pgsql1 pgpool: LOG: pid 5315: execute command:
> > /var/lib/postgresql/8.3/main/pgpool-failover 1 pgsql2.freyatest.domain
> > 5432 /var/lib/postgresql/8.3/main 0 0
> > Mar 5 14:51:08 pgsql1 pgpool[20704]: Executing pgpool-failover as
> > user postgres
> > Mar 5 14:51:08 pgsql1 pgpool[20705]: Failover of node 1 at hostname
> > pgsql2.freyatest.domain. New master node is 0. Old master node was 0.
> > Mar 5 14:51:08 pgsql1 pgpool: LOG: pid 5315: failover_handler: set
> > new master node: 0
> > Mar 5 14:51:08 pgsql1 pgpool: LOG: pid 5315: failover done.
> > shutdown host pgsql2.freyatest.domain(5432)
> >
> >
> > Anyone else having this problem?
> >
> > --
> > Jaume Sabater
> > http://linuxsilo.net/
> >
> > "Ubi sapientas ibi libertas"
> > _______________________________________________
> > Pgpool-general mailing list
> > [email protected]
> > http://pgfoundry.org/mailman/listinfo/pgpool-general
