Sorry, forgot to mention the most obvious and important information :
My postgres is 8.4.2
On Sep 24, 2012, at 13:33, Kiriakos Tsourapas wrote:
> Hi,
>
> The problem : Postgres is becoming slow, day after day, and only a full
> vacuum fixes the problem.
>
> Information you may need to evaluate :
>
> The problem lies on all tables and queries, as far as I can tell, but we can
> focus on a single table for better comprehension.
>
> The queries I am running to test the speed are :
> INSERT INTO "AWAITINGSTATUSSMPP" VALUES('143428', '1111', 1, '2012-06-16
> 13:39:19', '111');
> DELETE FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND
> "CONNECTIONID" = 1;
> SELECT * FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND
> "CONNECTIONID" = 1;
>
> After a full vacuum, they run in about 100ms.
> Today, before the full vacuum, they were taking around 500ms.
>
> Below is an explain analyze of the commands AFTER a full vacuum. I did not
> run it before, so I can not post relevant info before the vacuum. So, after
> the full vacuum :
>
> explain analyze INSERT INTO "AWAITINGSTATUSSMPP" VALUES('143428', '1111', 1,
> '2012-06-16 13:39:19', '111');
> "Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1
> loops=1)"
> "Trigger for constraint FK_AWAITINGSTATUSSMPP_MESSAGES: time=0.131 calls=1"
> "Trigger bucardo_add_delta: time=0.454 calls=1"
> "Trigger bucardo_triggerkick_MassSMs: time=0.032 calls=1"
> "Total runtime: 0.818 ms"
>
> explain analyze DELETE FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111'
> AND "CONNECTIONID" = 1;"Seq Scan on "AWAITINGSTATUSSMPP" (cost=0.00..2.29
> rows=1 width=6) (actual time=0.035..0.035 rows=0 loops=1)"
> " Filter: ((("SMSCMSGID")::text = '1111'::text) AND ("CONNECTIONID" = 1))"
> "Trigger bucardo_triggerkick_MassSMs: time=0.066 calls=1"
> "Total runtime: 0.146 ms"
>
> explain analyze SELECT * FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111'
> AND "CONNECTIONID" = 1;
> "Seq Scan on "AWAITINGSTATUSSMPP" (cost=0.00..2.29 rows=1 width=557) (actual
> time=0.028..0.028 rows=0 loops=1)"
> " Filter: ((("SMSCMSGID")::text = '1111'::text) AND ("CONNECTIONID" = 1))"
> "Total runtime: 0.053 ms"
>
> Below are the metadata of the table :
> =====================================
> CREATE TABLE "AWAITINGSTATUSSMPP"
> (
> "MESSAGEID" bigint NOT NULL,
> "SMSCMSGID" character varying(50) NOT NULL,
> "CONNECTIONID" smallint NOT NULL,
> "EXPIRE_TIME" timestamp without time zone NOT NULL,
> "RECIPIENT" character varying(20) NOT NULL,
> "CLIENT_MSG_ID" character varying(255),
> CONSTRAINT "PK_AWAITINGSTATUSSMPP" PRIMARY KEY ("SMSCMSGID",
> "CONNECTIONID"),
> CONSTRAINT "FK_AWAITINGSTATUSSMPP_MESSAGES" FOREIGN KEY ("MESSAGEID")
> REFERENCES "MESSAGES" ("ID") MATCH SIMPLE
> ON UPDATE NO ACTION ON DELETE CASCADE
> )
> WITH (
> OIDS=FALSE
> );
> ALTER TABLE "AWAITINGSTATUSSMPP" OWNER TO postgres;
> GRANT ALL ON TABLE "AWAITINGSTATUSSMPP" TO "MassSMsUsers";
>
> CREATE INDEX "IX_AWAITINGSTATUSSMPP_MSGID_RCP"
> ON "AWAITINGSTATUSSMPP"
> USING btree
> ("MESSAGEID", "RECIPIENT");
>
> CREATE TRIGGER bucardo_add_delta
> AFTER INSERT OR UPDATE OR DELETE
> ON "AWAITINGSTATUSSMPP"
> FOR EACH ROW
> EXECUTE PROCEDURE bucardo."bucardo_add_delta_SMSCMSGID|CONNECTIONID"();
>
> CREATE TRIGGER "bucardo_triggerkick_MassSMs"
> AFTER INSERT OR UPDATE OR DELETE OR TRUNCATE
> ON "AWAITINGSTATUSSMPP"
> FOR EACH STATEMENT
> EXECUTE PROCEDURE bucardo."bucardo_triggerkick_MassSMs"();
> =====================================
>
> The table only has about 200 records because it is being used a temporary
> storage and records are constantly inserted and deleted.
> BUT please don't get hold on this fact, because as I already said, the speed
> problem is not restricted to this table. The same problems appear on the
> following query
> UPDATE "MESSAGES" SET "SENT" = "SENT" + 1 WHERE "ID" = 143447;
> and MESSAGES table has mainly inserts and few deletes...
>
> My postgresql.conf file :
> ======================
> port = 5433 # (change requires restart)
> max_connections = 100 # (change requires restart)
> shared_buffers = 256MB # min 128kB. DoubleIP - Default was
> 32MB
> synchronous_commit = off # immediate fsync at commit. DoubleIP
> - Default was on
> effective_cache_size = 512MB # DoubleIP - Default was 128MB
> log_destination = 'stderr' # Valid values are combinations of
> logging_collector = on # Enable capturing of stderr and
> csvlog
> silent_mode = on # Run server silently.
> log_line_prefix = '%t %d %u ' # special values:
> log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
> autovacuum_naptime = 28800 # time between autovacuum runs.
> DoubleIP - default was 1min
> autovacuum_vacuum_threshold = 100 # min number of row updates before
> autovacuum_vacuum_scale_factor = 0.0 # fraction of table size before
> vacuum. DoubleIP - default was 0.2
> datestyle = 'iso, mdy'
> lc_messages = 'en_US.UTF-8' # locale for system error
> message
> lc_monetary = 'en_US.UTF-8' # locale for monetary
> formatting
> lc_numeric = 'en_US.UTF-8' # locale for number formatting
> lc_time = 'en_US.UTF-8' # locale for time formatting
> default_text_search_config = 'pg_catalog.english'
> =======================
>
> As you will see, I have altered the shared_buffers and synchronous_commit
> values.
> The shared_buffers had the default value 32Mb. When I changed it to 256Mb the
> problem still appears but it takes more time to appear (3-4 days). With 32MB,
> it appeared faster, probably after 24 hours.
> Also, I have changed the autovacuum daemon to work every 8 hours but I
> changed its values to make sure it vacuums pretty much all tables (the ones
> for which at least 100 rows have changed).
> Please note, though, that my problem existed even before playing around with
> the autovacuum. This is why I tried to change its values in the first place.
>
> The server is synchronized with another server using bucardo. Bucardo process
> is running on the other server.
> The same problem appears on the 2nd server too... after 3-4 days, postgres is
> running slower and slower.
>
> Our server configuration :
> DELL PowerEdge T610 Tower Chassis for Up to 8x 3.5" HDDs
> 2x Intel Xeon E5520 Processor (2.26GHz, 8M Cache, 5.86 GT/s QPI, Turbo, HT),
> 1066MHz Max Memory
> 8GB Memory,1333MHz
> 2 x 146GB SAS 15k 3.5" HD Hot Plug
> 6 x 1TB SATA 7.2k 3.5" Additional HD Hot Plug
> PERC 6/i RAID Controller Card 256MB PCIe, 2x4 Connectors
> SUSE Linux Enterprise Server 10, SP2
>
> The 2 HDs are set up with RAID-1
> The 6 HDs are set up with RAID-5
>
> Linux is running on the RAID-1 configuration
> Postgres is running on the RAID-5 configuration
>
>
> Finally a top before and after the full vacuum :
> top - 11:27:44 up 72 days, 13:27, 37 users, load average: 1.05, 1.31, 1.45
> Tasks: 279 total, 3 running, 276 sleeping, 0 stopped, 0 zombie
> Cpu(s): 3.6%us, 0.8%sy, 0.0%ni, 95.5%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
> Mem: 8166432k total, 7963116k used, 203316k free, 115344k buffers
> Swap: 2097144k total, 2097020k used, 124k free, 2337636k cached
>
> top - 11:30:58 up 72 days, 13:31, 38 users, load average: 1.53, 1.59, 1.53
> Tasks: 267 total, 2 running, 265 sleeping, 0 stopped, 0 zombie
> Cpu(s): 1.3%us, 0.4%sy, 0.0%ni, 98.0%id, 0.3%wa, 0.0%hi, 0.1%si, 0.0%st
> Mem: 8166432k total, 6016268k used, 2150164k free, 61092k buffers
> Swap: 2097144k total, 2010204k used, 86940k free, 2262896k cached
>
>
> I hope I have provided enough info and hope that someone can point me to the
> correct direction.
>
>
> Thank you very much even for reading up to here !
>
> Best regards,
> Kiriakos