date:20130515

[BUGS] WAL partition ran out of space, pg_subtrans & pg_clog partially written...

2013-05-15 Thread Sean Chittenden

This is an FYI re: a bug I ran across.

Background: RHEL 6 & ext4. PGDATA, a table space, and WAL logs are all on their 
own partitions.

The WAL partition filled up (wal_keep_segments was changed, but Pg hadn't been 
restarted), and a write happened, and it appears to have resulted in an index 
page being corrupt. REINDEX'ing the table didn't work with the following error:

> WARNING: concurrent delete in progress within table "tblA"

> WARNING: concurrent delete in progress within table "tblA"
> ERROR: could not access status of transaction 86081816
> DETAIL: Could not read from file "pg_subtrans/0521" at offset 131072: Success.

pg_subtrans/0521 was a 90112 byte file and pg_clog/0052 was 24576 bytes in size.

I haven't dug in to the code, but I did see the commit message for 
pgsql/src/backend/access/transam/slru.c 1.23.4.2, which I'm guessing is 
related. My WAG is that there's an assumption someplace that pg_subtrans and 
pg_clog are on the same partition as pg_xlog, or that creation of files in 
pg_subtrans and pg_clog will either absolutely succeed or absolutely fail. It 
could also be that Linux reported back a successful write(2), but it didn't 
actually have the space available (ext4).

Anyway, after extending pg_subtrans/0521 w/ zeros to its proper 256KB size, I 
was able to REINDEX the table, but there was a stream of WARNINGs about 
"concurrent inserts and deletes" that I didn't dig in to. Upon learning the WAL 
files were removed as a temporary solution to the space problem, I opted to 
dump, re-initdb, and load the data, which worked without any errors or warnings 
being reported.

I've saved the data if there are pointed questions about their contents.

-sc

--
Sean Chittenden
s...@chittenden.org



-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

2013-05-15 Thread Tom Lane

"Todd A. Cook"  writes:
> On 05/15/13 16:10, Tom Lane wrote:
>> Given the reference to EvalPlanQual in your stack trace, I'm thinking
>> the explanation is this 9.0 fix:

> Thanks for the explanation.  Is there any chance of that fix being backpatched
> into 8.4?

None whatsoever.  Aside from the mentioned change in semantics, there
were numerous followup fixes, which would be difficult to disentangle
from everything else that changed in 9.0.  We would not risk
destabilizing 8.4 like that.

8.4 is barely more a year from EOL anyway, so you really ought to be
thinking about an upgrade.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

2013-05-15 Thread Todd A. Cook


On 05/15/13 16:10, Tom Lane wrote:

"Todd A. Cook"  writes:

On 05/15/13 13:27, tc...@blackducksoftware.com wrote:

When nearly identical update queries arrive simultaneously, the first one to
execute runs normally, but subsequent executions run _extremely_ slowly.
We've seen this behaviour in production, and the contrived test case below
reproduces the issue.



I've repeated the test below on a 9.1.9 installation, and it works fine there.


Given the reference to EvalPlanQual in your stack trace, I'm thinking
the explanation is this 9.0 fix:


Thanks for the explanation.  Is there any chance of that fix being backpatched
into 8.4?

-- todd



Author: Tom Lane 
Branch: master Release: REL9_0_BR [9f2ee8f28] 2009-10-26 02:26:45 +

 Re-implement EvalPlanQual processing to improve its performance and 
eliminate
 a lot of strange behaviors that occurred in join cases.  We now identify 
the
 "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
 UPDATE/SHARE queries.  If an EvalPlanQual recheck is necessary, we jam the
 appropriate row into each scan node in the rechecking plan, forcing it to 
emit
 only that one row.  The former behavior could rescan the whole of each 
joined
 relation for each recheck, which was terrible for performance, and what's 
much
 worse could result in duplicated output tuples.

 Also, the original implementation of EvalPlanQual could not re-use the 
recheck
 execution tree --- it had to go through a full executor init and shutdown 
for
 every row to be tested.  To avoid this overhead, I've associated a special
 runtime Param with each LockRows or ModifyTable plan node, and arranged to
 make every scan node below such a node depend on that Param.  Thus, by
 signaling a change in that Param, the EPQ machinery can just rescan the
 already-built test plan.

 This patch also adds a prohibition on set-returning functions in the
 targetlist of SELECT FOR UPDATE/SHARE.  This is needed to avoid the
 duplicate-output-tuple problem.  It seems fairly reasonable since the
 other restrictions on SELECT FOR UPDATE are meant to ensure that there
 is a unique correspondence between source tuples and result tuples,
 which an output SRF destroys as much as anything else does.

regards, tom lane






--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

2013-05-15 Thread Tom Lane

"Todd A. Cook"  writes:
> On 05/15/13 13:27, tc...@blackducksoftware.com wrote:
>> When nearly identical update queries arrive simultaneously, the first one to
>> execute runs normally, but subsequent executions run _extremely_ slowly.
>> We've seen this behaviour in production, and the contrived test case below
>> reproduces the issue.

> I've repeated the test below on a 9.1.9 installation, and it works fine there.

Given the reference to EvalPlanQual in your stack trace, I'm thinking
the explanation is this 9.0 fix:

Author: Tom Lane 
Branch: master Release: REL9_0_BR [9f2ee8f28] 2009-10-26 02:26:45 +

Re-implement EvalPlanQual processing to improve its performance and 
eliminate
a lot of strange behaviors that occurred in join cases.  We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries.  If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to 
emit
only that one row.  The former behavior could rescan the whole of each 
joined
relation for each recheck, which was terrible for performance, and what's 
much
worse could result in duplicated output tuples.

Also, the original implementation of EvalPlanQual could not re-use the 
recheck
execution tree --- it had to go through a full executor init and shutdown 
for
every row to be tested.  To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param.  Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.

This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE.  This is needed to avoid the
duplicate-output-tuple problem.  It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

2013-05-15 Thread Todd A. Cook


On 05/15/13 13:27, tc...@blackducksoftware.com wrote:

The following bug has been logged on the website:

Bug reference:  8163
Logged by:  Todd Cook
Email address:  tc...@blackducksoftware.com
PostgreSQL version: 8.4.16
Operating system:   Fedora 14
Description:

When nearly identical update queries arrive simultaneously, the first one to
execute runs normally, but subsequent executions run _extremely_ slowly.
We've seen this behaviour in production, and the contrived test case below
reproduces the issue.


I've repeated the test below on a 9.1.9 installation, and it works fine there.
Each update finished in about 7 seconds.

-- todd



test=> select version() ;
   version


  PostgreSQL 8.4.16 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.4.4 20100726 (Red Hat 4.4.4-13), 64-bit


To set up:

create table prof as select i as id, i::text col1, (i*2)::text col2 ,
(i*3)::text col3, i*2 col4, md5((i % 5999)::text) as hash, (i % 6000)::text
as hint, (i*4)::text col6, i*5 col7, i*6 col8 from
generate_series(1,3600) i ;
create table tree as select 'fixed16charstrng'::text as changeme, md5((i %
20)::text) as hash from generate_series(1,40) i ;
create index tree_hash_idx on tree(hash) ;

The problematic query run in isolation:

explain analyze update tree set changeme = 'changed' where hash in (select
hash from prof where hint = '2500') ;
   QUERY PLAN

--
  Nested Loop  (cost=198.75..1000104.44 rows=11583 width=39) (actual
time=6765.316..6871.167 rows=11998 loops=1)
->  HashAggregate  (cost=198.75..198.76 rows=1 width=33) (actual
time=6765.264..6768.259 rows=5999 loops=1)
  ->  Seq Scan on prof  (cost=0.00..184.15 rows=5840 width=33)
(actual time=1.351..6755.691 rows=6000 loops=1)
Filter: (hint = '2500'::text)
->  Index Scan using tree_hash_idx on tree  (cost=0.00..5.65 rows=2
width=39) (actual time=0.014..0.016 rows=2 loops=5999)
  Index Cond: (tree.hash = prof.hash)
  Total runtime: 7132.700 ms
(7 rows)

To exercise the problem (assuming a database named "test"):
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2500')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2501')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2502')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2503')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2504')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2505')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2506')" test &

One of the update begins executing immediately, while the others block
waiting on the first (which is expected). The first update finished in under
10 seconds, and another one started executing; however, this second one has
now been executing for 2 hours.

strace output from that backend is almost exclusively reads, with only a few
calls to lseek.  Attaching with gdb and interrupting a few times mostly gave
this backtrace:

#0  0x003b812d3490 in __read_nocancel () from /lib64/libc.so.6
#1  0x005cd0cd in FileRead ()
#2  0x005dc55d in mdread ()
#3  0x005ca315 in ReadBuffer_common ()
#4  0x005cac7f in ReadBufferExtended ()
#5  0x00460c8b in heapgetpage ()
#6  0x0046110a in heapgettup_pagemode ()
#7  0x00461b56 in heap_getnext ()
#8  0x0054ef18 in SeqNext ()
#9  0x005429ba in ExecScan ()
#10 0x0053b8a8 in ExecProcNode ()
#11 0x00547ac8 in ExecAgg ()
#12 0x0053b7b8 in ExecProcNode ()
#13 0x0054e031 in ExecNestLoop ()
#14 0x0053b818 in ExecProcNode ()
#15 0x0053827e in EvalPlanQualNext ()
#16 0x0053867b in EvalPlanQual ()
#17 0x00539afd in standard_ExecutorRun ()
#18 0x7f796347881b in pgss_ExecutorRun (queryDesc=0x1af53b0,
direction=ForwardScanDirection, count=0) at pg_stat_statements.c:516
#19 0x005e3ad1 in ProcessQuery ()
#20 0x005e3cd4 in PortalRunMulti ()
#21 0x005e4452 in PortalRun ()
#22 0x005dfac9 in exec_simple_query ()
#23 0x005e10f3 in PostgresMain ()
#24 0x005b4e73 in ServerLoop ()
#25 0x005b729e in PostmasterMain ()
#26 0x00562578 in main ()







--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your sub

[BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

2013-05-15 Thread tcook

The following bug has been logged on the website:

Bug reference:  8163
Logged by:  Todd Cook
Email address:  tc...@blackducksoftware.com
PostgreSQL version: 8.4.16
Operating system:   Fedora 14
Description:

When nearly identical update queries arrive simultaneously, the first one to
execute runs normally, but subsequent executions run _extremely_ slowly. 
We've seen this behaviour in production, and the contrived test case below
reproduces the issue.

test=> select version() ;
  version   
   

 PostgreSQL 8.4.16 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.4.4 20100726 (Red Hat 4.4.4-13), 64-bit


To set up:

create table prof as select i as id, i::text col1, (i*2)::text col2 ,
(i*3)::text col3, i*2 col4, md5((i % 5999)::text) as hash, (i % 6000)::text
as hint, (i*4)::text col6, i*5 col7, i*6 col8 from
generate_series(1,3600) i ;
create table tree as select 'fixed16charstrng'::text as changeme, md5((i %
20)::text) as hash from generate_series(1,40) i ;
create index tree_hash_idx on tree(hash) ;

The problematic query run in isolation:

explain analyze update tree set changeme = 'changed' where hash in (select
hash from prof where hint = '2500') ;
  QUERY PLAN
 
--
 Nested Loop  (cost=198.75..1000104.44 rows=11583 width=39) (actual
time=6765.316..6871.167 rows=11998 loops=1)
   ->  HashAggregate  (cost=198.75..198.76 rows=1 width=33) (actual
time=6765.264..6768.259 rows=5999 loops=1)
 ->  Seq Scan on prof  (cost=0.00..184.15 rows=5840 width=33)
(actual time=1.351..6755.691 rows=6000 loops=1)
   Filter: (hint = '2500'::text)
   ->  Index Scan using tree_hash_idx on tree  (cost=0.00..5.65 rows=2
width=39) (actual time=0.014..0.016 rows=2 loops=5999)
 Index Cond: (tree.hash = prof.hash)
 Total runtime: 7132.700 ms
(7 rows)

To exercise the problem (assuming a database named "test"):
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2500')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2501')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2502')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2503')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2504')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2505')" test &
psql -c "update tree set changeme = 'changed' where hash in (select hash
from prof where hint = '2506')" test &

One of the update begins executing immediately, while the others block
waiting on the first (which is expected). The first update finished in under
10 seconds, and another one started executing; however, this second one has
now been executing for 2 hours.

strace output from that backend is almost exclusively reads, with only a few
calls to lseek.  Attaching with gdb and interrupting a few times mostly gave
this backtrace:

#0  0x003b812d3490 in __read_nocancel () from /lib64/libc.so.6
#1  0x005cd0cd in FileRead ()
#2  0x005dc55d in mdread ()
#3  0x005ca315 in ReadBuffer_common ()
#4  0x005cac7f in ReadBufferExtended ()
#5  0x00460c8b in heapgetpage ()
#6  0x0046110a in heapgettup_pagemode ()
#7  0x00461b56 in heap_getnext ()
#8  0x0054ef18 in SeqNext ()
#9  0x005429ba in ExecScan ()
#10 0x0053b8a8 in ExecProcNode ()
#11 0x00547ac8 in ExecAgg ()
#12 0x0053b7b8 in ExecProcNode ()
#13 0x0054e031 in ExecNestLoop ()
#14 0x0053b818 in ExecProcNode ()
#15 0x0053827e in EvalPlanQualNext ()
#16 0x0053867b in EvalPlanQual ()
#17 0x00539afd in standard_ExecutorRun ()
#18 0x7f796347881b in pgss_ExecutorRun (queryDesc=0x1af53b0,
direction=ForwardScanDirection, count=0) at pg_stat_statements.c:516
#19 0x005e3ad1 in ProcessQuery ()
#20 0x005e3cd4 in PortalRunMulti ()
#21 0x005e4452 in PortalRun ()
#22 0x005dfac9 in exec_simple_query ()
#23 0x005e10f3 in PostgresMain ()
#24 0x005b4e73 in ServerLoop ()
#25 0x005b729e in PostmasterMain ()
#26 0x00562578 in main ()



-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #8160: 9.3 Beta 1 Initdb doesn't work

2013-05-15 Thread Devrim GÜNDÜZ


Hi again,

On Tue, 2013-05-14 at 13:48 +, bnichol...@hp.com wrote:

> I've installed the 9.3 beta 1 packages (via pgdg repo and yum) and when I
> try to do an initdb it fails. /var/lib/pgsql/9.3/data has nothing in it when
> I attempt the initdb :
> 
> #service postgresql-9.3 initdb
> Initializing database:[FAILED]



Fixed this issue with 9.3beta1-4 RPMS. Thanks for the report (and thanks
Heikki for finding the real issue)

Regards,
-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz


signature.asc
Description: This is a digitally signed message part

Re: [BUGS] BUG #8159: PostgreSQlk is not working

2013-05-15 Thread Sandeep Thakkar

Please send us the installation logs. They can be found in %TEMP% with
the name "install-postgresql.log"

On Tue, May 14, 2013 at 5:08 PM,   wrote:
> The following bug has been logged on the website:
>
> Bug reference:  8159
> Logged by:  Janos
> Email address:  csibe1985...@freemail.hu
> PostgreSQL version: 9.2.4
> Operating system:   Vista
> Description:
>
> Problem runninng post-install step.Installation may not complete correctly.
> The database cluster initialisation failed.
>
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

-- 
Sandeep Thakkar
Senior Software Engineer
EnterpriseDB Corporation
The Postgres Database Company
Phone: +91.20.30589505

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of
the individual or entity to whom it is addressed. This message
contains information from EnterpriseDB Corporation that may be
privileged, confidential, or exempt from disclosure under applicable
law. If you are not the intended recipient or authorized to receive
this for the intended recipient, any use, dissemination, distribution,
retention, archiving, or copying of this communication is strictly
prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

[BUGS] WAL partition ran out of space, pg_subtrans & pg_clog partially written...

Re: [BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

Re: [BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

Re: [BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

Re: [BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

[BUGS] BUG #8163: simultaneous nearly identical update queries execute extremely slowly

Re: [BUGS] BUG #8160: 9.3 Beta 1 Initdb doesn't work

Re: [BUGS] BUG #8159: PostgreSQlk is not working

8 matches

Site Navigation

Mail list logo

Footer information