Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-29 Thread Nikolas Everett
See http://www.postgresql.org/docs/current/static/storage-page-layout.html for
all of what is taking up the space.  Short version:
 Per block overhead is > 24 bytes
 Per row overhead is 23 bytes + some alignment loss + the null bitmap if you
have nullable columns

On Mon, Mar 29, 2010 at 8:24 AM, raghavendra t wrote:

> Hi Mattew,
>
> Thank you for the information.
>
> Once again, I like to thank each and everyone in this thread for there
> ultimate support.
>
> Regards
> Raghavendra
>
> On Mon, Mar 29, 2010 at 4:47 PM, Matthew Wakeling wrote:
>
>> On Mon, 29 Mar 2010, Tadipathri Raghu wrote:
>>
>>> As per the documentation, one page is 8kb, when i create a table with int
>>> as
>>> one column its 4 bytes. If i insert 2000 rows, it should be in one page
>>> only
>>> as its 8kb, but its extending vastly as expected. Example shown below,
>>> taking the previous example table test with one column.
>>>
>>
>> There is more to a row than just the single int column. The space used by
>> a column will include a column start marker (data length), transaction ids,
>> hint bits, an oid, a description of the types of the columns, and finally
>> your data columns. That takes a bit more space.
>>
>> Matthew
>>
>> --
>> If you let your happiness depend upon how somebody else feels about you,
>> now you have to control how somebody else feels about you. -- Abraham
>> Hicks
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org
>> )
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>>
>
>


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-29 Thread raghavendra t
Hi Mattew,

Thank you for the information.

Once again, I like to thank each and everyone in this thread for there
ultimate support.

Regards
Raghavendra

On Mon, Mar 29, 2010 at 4:47 PM, Matthew Wakeling wrote:

> On Mon, 29 Mar 2010, Tadipathri Raghu wrote:
>
>> As per the documentation, one page is 8kb, when i create a table with int
>> as
>> one column its 4 bytes. If i insert 2000 rows, it should be in one page
>> only
>> as its 8kb, but its extending vastly as expected. Example shown below,
>> taking the previous example table test with one column.
>>
>
> There is more to a row than just the single int column. The space used by a
> column will include a column start marker (data length), transaction ids,
> hint bits, an oid, a description of the types of the columns, and finally
> your data columns. That takes a bit more space.
>
> Matthew
>
> --
> If you let your happiness depend upon how somebody else feels about you,
> now you have to control how somebody else feels about you. -- Abraham Hicks
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-29 Thread Matthew Wakeling

On Mon, 29 Mar 2010, Tadipathri Raghu wrote:

As per the documentation, one page is 8kb, when i create a table with int as
one column its 4 bytes. If i insert 2000 rows, it should be in one page only
as its 8kb, but its extending vastly as expected. Example shown below,
taking the previous example table test with one column.


There is more to a row than just the single int column. The space used by 
a column will include a column start marker (data length), transaction 
ids, hint bits, an oid, a description of the types of the columns, and 
finally your data columns. That takes a bit more space.


Matthew

--
If you let your happiness depend upon how somebody else feels about you,
now you have to control how somebody else feels about you. -- Abraham Hicks

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Tadipathri Raghu
Hi Tom,

Thank for the update.


> IIRC, it will set the relpages/reltuples counts (though not any
> more-complex statistics); but only if the table is found to not be
> completely empty.  Again, this is a behavior designed with common
> usage patterns in mind, to not set relpages/reltuples to zero on a
> table that's likely to get populated shortly.
>
As Harris, asked about creation of index will update the statistics. Yes
indexes are updating the statistics, so indexes will analyze the table on
the backend and update the statistics too, before it creating the index or
after creating the index.

Example
==
postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test  VALUES (1);
INSERT 0 1
postgres=# select relname,reltuples,relpages from pg_class where
relname='test';
 relname | reltuples | relpages
-+---+--
 test| 0 |0
(1 row)
postgres=# create INDEX itest on test (id);
CREATE INDEX
postgres=# select relname,reltuples,relpages from pg_class where
relname='test';
 relname | reltuples | relpages
-+---+--
 test| 1 |1
(1 row)

Adding one more thing to this thread
==
As per the documentation, one page is 8kb, when i create a table with int as
one column its 4 bytes. If i insert 2000 rows, it should be in one page only
as its 8kb, but its extending vastly as expected. Example shown below,
taking the previous example table test with one column.

postgres=# insert into test VALUES (generate_series(2,2000));
INSERT 0 1999
postgres=# \dt+
List of relations
  Schema  | Name | Type  |  Owner   | Size  | Description
--+--+---+--+---+-
 edbstore | test | table | postgres | 64 kB |
(1 row)
postgres=# select count(*) from test ;
 count
---
  2000
(1 row)

Why the its extending so many pages, where it can fit in one page. Is there
any particular reason in behaving this type of paging.

Thanks for all in advance

Regards
Raghavendra



On Sun, Mar 28, 2010 at 11:07 PM, Tom Lane  wrote:

> Jeremy Harris  writes:
> > On 03/28/2010 05:27 PM, Tom Lane wrote:
> >> This is intentional: the size estimates for a never-yet-analyzed
> >> table are *not* zero.  This is because people frequently create and load
> >> up a table and then immediately query it without an explicit ANALYZE.
>
> > Does the creation of an index also populate statistics?
>
> IIRC, it will set the relpages/reltuples counts (though not any
> more-complex statistics); but only if the table is found to not be
> completely empty.  Again, this is a behavior designed with common
> usage patterns in mind, to not set relpages/reltuples to zero on a
> table that's likely to get populated shortly.
>
>regards, tom lane
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Tom Lane
Jeremy Harris  writes:
> On 03/28/2010 05:27 PM, Tom Lane wrote:
>> This is intentional: the size estimates for a never-yet-analyzed
>> table are *not* zero.  This is because people frequently create and load
>> up a table and then immediately query it without an explicit ANALYZE.

> Does the creation of an index also populate statistics?

IIRC, it will set the relpages/reltuples counts (though not any
more-complex statistics); but only if the table is found to not be
completely empty.  Again, this is a behavior designed with common
usage patterns in mind, to not set relpages/reltuples to zero on a
table that's likely to get populated shortly.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Jeremy Harris

On 03/28/2010 05:27 PM, Tom Lane wrote:

  This is intentional: the size estimates for a never-yet-analyzed
table are *not* zero.  This is because people frequently create and load
up a table and then immediately query it without an explicit ANALYZE.


Does the creation of an index also populate statistics?

Thanks,
Jeremy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Tom Lane
Tadipathri Raghu  writes:
> Frankly speaking its a newly created table without any operation on it as
> you have seen the example. Then how come it showing those many rows where we
> have only one in it.

Yes.  This is intentional: the size estimates for a never-yet-analyzed
table are *not* zero.  This is because people frequently create and load
up a table and then immediately query it without an explicit ANALYZE.
The quality of the plans you'd get at that point (especially for joins)
would be spectacularly bad if the default assumption were that the table
was very small.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Gary Doades

On 28/03/2010 10:07 AM, Tadipathri Raghu wrote:

Hi All,
I want to give some more light on this by analysing more like this
1. In my example I have created a table with one column as INT( which 
occupies 4 bytes)

2. Initially it occupies one page of  space on the file that is (8kb).
So, here is it assuming these many rows may fit in this page. Clarify 
me on this Please.


Like I said, it's just a guess. With no statistics all postgres can do 
is guess, or in this case use the in-built default for a newly created 
table. It could guess 1 or it could guess 10,000,000. What it does is 
produce a reasonable guess in the absence of any other information.


You should read the postgres documentation for further information about 
statistics and how the optimizer uses them.



Regards,
Gary.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Frank Heikens


Op 28 mrt 2010, om 11:07 heeft Tadipathri Raghu het volgende geschreven:


Hi All,

I want to give some more light on this by analysing more like this

1. In my example I have created a table with one column as  
INT( which occupies 4 bytes)

2. Initially it occupies one page of  space on the file that is (8kb).

So, here is it assuming these many rows may fit in this page.  
Clarify me on this Please.


See these chapters in the manual: 
http://www.postgresql.org/docs/8.4/interactive/storage.html

The minimum size of a file depends on the block size, by default 8kb: 
http://www.postgresql.org/docs/8.4/interactive/install-procedure.html

Regards,
Frank




Regards
Raghavendra


On Sun, Mar 28, 2010 at 2:06 PM, Gary Doades  wrote:
On 28/03/2010 8:33 AM, Tadipathri Raghu wrote:


Hi Guz,

It is assuming that there are 2400 rows in this table. Probably  
you've deleted some rows from the table leaving just one.


Frankly speaking its a newly created table without any operation on  
it as you have seen the example. Then how come it showing those  
many rows where we have only one in it.

Thanks if we have proper explination on this..
It's not *showing* any rows at all, it's *guessing* 2400 rows  
because you've never analyzed the table. Without any statistics at  
all, postgres will use some form of in-built guess for a table that  
produces reasonable plans under average conditions. As you've  
already seen, once you analyze the table, the guess get's much   
better and therefore would give you a more appropriate plan.


Regards,
Gary.









Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Tadipathri Raghu
Hi All,

I want to give some more light on this by analysing more like this

1. In my example I have created a table with one column as INT( which
occupies 4 bytes)
2. Initially it occupies one page of  space on the file that is (8kb).

So, here is it assuming these many rows may fit in this page. Clarify me on
this Please.

Regards
Raghavendra


On Sun, Mar 28, 2010 at 2:06 PM, Gary Doades  wrote:

>  On 28/03/2010 8:33 AM, Tadipathri Raghu wrote:
>
> Hi Guz,
>
>
>> It is assuming that there are 2400 rows in this table. Probably you've
>> deleted some rows from the table leaving just one.
>
>
> Frankly speaking its a newly created table without any operation on it as
> you have seen the example. Then how come it showing those many rows where we
> have only one in it.
> Thanks if we have proper explination on this..
>
> It's not *showing* any rows at all, it's *guessing* 2400 rows because
> you've never analyzed the table. Without any statistics at all, postgres
> will use some form of in-built guess for a table that produces reasonable
> plans under average conditions. As you've already seen, once you analyze the
> table, the guess get's much  better and therefore would give you a more
> appropriate plan.
>
> Regards,
> Gary.
>
>


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Tadipathri Raghu
Hi Guz,


> It is assuming that there are 2400 rows in this table. Probably you've
> deleted some rows from the table leaving just one.


Frankly speaking its a newly created table without any operation on it as
you have seen the example. Then how come it showing those many rows where we
have only one in it.
Thanks if we have proper explination on this..

Regards
Raghavendra


On Sun, Mar 28, 2010 at 12:59 PM, Szymon Guz  wrote:

>
>
>  2010/3/28 Tadipathri Raghu 
>
>> Hi Guz,
>>
>> Thank you for the prompt reply.
>>
>>
>>> No, the optimizer is not retrieving anything, it just assumes that there
>>> are 2400 rows because that is the number of rows that exists in the
>>> statictics for this table. The optimizer just tries to find the best plan
>>> and to optimize the query plan for execution taking into consideration all
>>> information that can be found for this table (it also looks in the
>>> statistics information about rows from this table).
>>
>>
>> So, whats it assuming here as rows(2400).  Could you explain this.
>>
>>
>>
> It is assuming that there are 2400 rows in this table. Probably you've
> deleted some rows from the table leaving just one.
>
> regards
> Szymon Guz
>
>


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Szymon Guz
2010/3/28 Tadipathri Raghu 

> Hi Guz,
>
> Thank you for the prompt reply.
>
>
>> No, the optimizer is not retrieving anything, it just assumes that there
>> are 2400 rows because that is the number of rows that exists in the
>> statictics for this table. The optimizer just tries to find the best plan
>> and to optimize the query plan for execution taking into consideration all
>> information that can be found for this table (it also looks in the
>> statistics information about rows from this table).
>
>
> So, whats it assuming here as rows(2400).  Could you explain this.
>
>
>
It is assuming that there are 2400 rows in this table. Probably you've
deleted some rows from the table leaving just one.

regards
Szymon Guz


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Tadipathri Raghu
Hi Guz,

Thank you for the prompt reply.


> No, the optimizer is not retrieving anything, it just assumes that there
> are 2400 rows because that is the number of rows that exists in the
> statictics for this table. The optimizer just tries to find the best plan
> and to optimize the query plan for execution taking into consideration all
> information that can be found for this table (it also looks in the
> statistics information about rows from this table).


So, whats it assuming here as rows(2400).  Could you explain this.

Regards
Raghavendra
On Sun, Mar 28, 2010 at 12:32 PM, Szymon Guz  wrote:

> 2010/3/28 Tadipathri Raghu 
>
>  Hi All,
>>
>> Example on optimizer
>> ===
>> postgres=# create table test(id int);
>> CREATE TABLE
>> postgres=# insert into test VALUES (1);
>> INSERT 0 1
>> postgres=# select * from test;
>>  id
>> 
>>   1
>> (1 row)
>> postgres=# explain select * from test;
>>QUERY PLAN
>> 
>>  Seq Scan on test  (cost=0.00..34.00 *rows=2400* width=4)
>> (1 row)
>> In the above, example the optimizer is retreiving those many rows where
>> there is only one row in that table. If i analyze am geting one row.
>>
>
> No, the optimizer is not retrieving anything, it just assumes that there
> are 2400 rows because that is the number of rows that exists in the
> statictics for this table. The optimizer just tries to find the best plan
> and to optimize the query plan for execution taking into consideration all
> information that can be found for this table (it also looks in the
> statistics information about rows from this table).
>
>
>>
>> postgres=# ANALYZE test;
>> ANALYZE
>> postgres=# explain select * from test;
>>  QUERY PLAN
>> 
>>  Seq Scan on test  (cost=0.00..1.01 *rows=1* width=4)
>> (1 row)
>>
>> My question here is, what it retreiving as rows when there is no such. One
>> more thing, if i wont do analyze and run the explain plan for three or more
>> times, then catalogs getting updated automatically and resulting the correct
>> row as 1.
>>
>>
>
> Now ANALYZE changed the statistics for this table and now the planner knows
> that there is just one row. In the background there can work autovacuum so
> it changes rows automatically (the autovacuum work characteristic depends on
> the settings for the database).
>
>
>> Q2. Does explain , will update the catalogs automatically.
>>
>>
>
> No, explain doesn't update table's statistics.
>
>
> regards
> Szymon Guz
>


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Szymon Guz
2010/3/28 Tadipathri Raghu 

> Hi All,
>
> Example on optimizer
> ===
> postgres=# create table test(id int);
> CREATE TABLE
> postgres=# insert into test VALUES (1);
> INSERT 0 1
> postgres=# select * from test;
>  id
> 
>   1
> (1 row)
> postgres=# explain select * from test;
>QUERY PLAN
> 
>  Seq Scan on test  (cost=0.00..34.00 *rows=2400* width=4)
> (1 row)
> In the above, example the optimizer is retreiving those many rows where
> there is only one row in that table. If i analyze am geting one row.
>

No, the optimizer is not retrieving anything, it just assumes that there are
2400 rows because that is the number of rows that exists in the statictics
for this table. The optimizer just tries to find the best plan and to
optimize the query plan for execution taking into consideration all
information that can be found for this table (it also looks in the
statistics information about rows from this table).


>
> postgres=# ANALYZE test;
> ANALYZE
> postgres=# explain select * from test;
>  QUERY PLAN
> 
>  Seq Scan on test  (cost=0.00..1.01 *rows=1* width=4)
> (1 row)
>
> My question here is, what it retreiving as rows when there is no such. One
> more thing, if i wont do analyze and run the explain plan for three or more
> times, then catalogs getting updated automatically and resulting the correct
> row as 1.
>
>

Now ANALYZE changed the statistics for this table and now the planner knows
that there is just one row. In the background there can work autovacuum so
it changes rows automatically (the autovacuum work characteristic depends on
the settings for the database).


> Q2. Does explain , will update the catalogs automatically.
>
>

No, explain doesn't update table's statistics.


regards
Szymon Guz