Re: [GENERAL] explain plan visibility
高健 wrote: > Please let me dig it a little more: > > I think that when a session is accessing a postgersql table. It will be > influenced by the > followings: > > > > Really old data (needed to be vacuumed, eg: old image at one day ago). > > Recent data (committed and uncommitted), because they are all in the data > block. > > > > Isn’t it strange that I have to access my data among somebody’s un-decided > data? > > How if there is a busy system having a table accessed by so many sessions at > same time? > > They will all be slowed down because of uncommitted but flushed out data, I > think. > > > > I hope in future the architecture of PostgreSQL can put the committed data & > uncommitted data apart, > > Or even put them in separate physical disks.That will Help to improve > performance I think. I don't think that this design choice will ever be changed. You'll always be influenced by other people's data, no matter how you handle MVCC. Let's imagine a design where yet uncommitted data are stored somewhere else. How do you handle the following problems: 1) The space for uncommitted data runs out. That will keep you from doing any transaction exceeding the size of this storage space. Note that Oracle suffers from that problem (though they do the opposite of what you propose: they store uncommitted data in the table and move old data to the "undo space"). 2) Every COMMIT will require that everything is moved from the "uncommited space" to the regular table. So you'd double the I/O cost for the normal case (I guess that's the reason why Oracle does it the other way round). And you wouldn't even get rid of the most unpleasant side effect of the way that PostgreSQL manages MVCC, the need to VACUUM away old tuples. As for Oracle's approach to MVCC, which I think is better than yours, it has its pros and cons, but I don't think it is demonstrably better than PostgreSQL's. No matter how you implement MVCC, you'll have to pay the price for the benefits somehow. Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] explain plan visibility
Hi Laurenz: Thank you for your kind reply. Please let me dig it a little more: I think that when a session is accessing a postgersql table. It will be influenced by the followings: Really old data (needed to be vacuumed, eg: old image at one day ago). Recent data (committed and uncommitted), because they are all in the data block. Isn’t it strange that I have to access my data among somebody’s un-decided data? How if there is a busy system having a table accessed by so many sessions at same time? They will all be slowed down because of uncommitted but flushed out data, I think. I hope in future the architecture of PostgreSQL can put the committed data & uncommitted data apart, Or even put them in separate physical disks.That will Help to improve performance I think. Jian Gao 2012/11/9 Albe Laurenz > 高健 wrote: > > I have one question about the visibility of explain plan. > > > > Firstly , I was inserting into data to a table. I use : [ insert > into ptest select * from > > test02; ] > > > > And test02 table has 10,000,000 records. And ptest is a parent table, > which has two distribution > > child table --- ctest01 and ctest02. > > > > When I execute the above sql statement, it takes some time to execute > because of data volume. > > > > Before the above sql statement finish, I open another session with > psql, and execute: [ select > > count(*) from ptest; ] > > Because the insert into statement in other session has not finished, I > got the result of zero. > > > > Before first session finish, If I check the explain of select, I got: > > postgres=# explain select count(*) from ptest; > > QUERY PLAN > > > --- > > Aggregate (cost=55406.40..55406.41 rows=1 width=0) > >-> Append (cost=0.00..49601.92 rows=2321793 width=0) > > -> Seq Scan on ptest (cost=0.00..0.00 rows=1 width=0) > > -> Seq Scan on ctest01 ptest (cost=0.00..24776.52 > rows=1159752 width=0) > > -> Seq Scan on ctest02 ptest (cost=0.00..24825.40 > rows=1162040 width=0) > > (5 rows)postgres=# > > > > I think that is because postgresql is holding commited and uncommited > data together > > physically(vacuum is needed because of it?). > > > > Is there some method that allow simple select such as select count(*) > do not activate the explain > > plan ? > > (I think the more table is bigger, the more sample data is needed) > > "SELECT count(*) FROM ptest" may be simple, but it is expensive > because it has to visit every tuple in the table. > The first time you run it after the insert it might also trigger > considerable write activity (hint bits), but that on the side. > > If you need only an estimate, try > SELECT sum(reltuples) FROM pg_class WHERE relname IN ('ptest', 'ctest01', > 'ctest02'); > > Yours, > Laurenz Albe >
Re: [GENERAL] explain plan visibility
高健 wrote: > I have one question about the visibility of explain plan. > > Firstly , I was inserting into data to a table. I use : [ insert into > ptest select * from > test02; ] > > And test02 table has 10,000,000 records. And ptest is a parent table, which > has two distribution > child table --- ctest01 and ctest02. > > When I execute the above sql statement, it takes some time to execute > because of data volume. > > Before the above sql statement finish, I open another session with psql, and > execute: [ select > count(*) from ptest; ] > Because the insert into statement in other session has not finished, I got > the result of zero. > > Before first session finish, If I check the explain of select, I got: > postgres=# explain select count(*) from ptest; > QUERY PLAN > --- > Aggregate (cost=55406.40..55406.41 rows=1 width=0) >-> Append (cost=0.00..49601.92 rows=2321793 width=0) > -> Seq Scan on ptest (cost=0.00..0.00 rows=1 width=0) > -> Seq Scan on ctest01 ptest (cost=0.00..24776.52 rows=1159752 > width=0) > -> Seq Scan on ctest02 ptest (cost=0.00..24825.40 rows=1162040 > width=0) > (5 rows)postgres=# > > I think that is because postgresql is holding commited and uncommited data > together > physically(vacuum is needed because of it?). > > Is there some method that allow simple select such as select count(*) do > not activate the explain > plan ? > (I think the more table is bigger, the more sample data is needed) "SELECT count(*) FROM ptest" may be simple, but it is expensive because it has to visit every tuple in the table. The first time you run it after the insert it might also trigger considerable write activity (hint bits), but that on the side. If you need only an estimate, try SELECT sum(reltuples) FROM pg_class WHERE relname IN ('ptest', 'ctest01', 'ctest02'); Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general