Re: RES: RES: [PERFORM] select on 1milion register = 6s

2007-07-29 Thread Decibel!
Please reply-all so others can learn and contribute.

On Sun, Jul 29, 2007 at 09:38:12PM -0700, Craig James wrote:
> Decibel! wrote:
> >It's unlikely that it's going to be faster to index scan 2.3M rows than
> >to sequential scan them. Try setting enable_seqscan=false and see if it
> >is or not.
> 
> Out of curiosity ... Doesn't that depend on the table?  Are all of the data 
> for one row stored contiguously, or are the data stored column-wise?  If 
> it's the former, and the table has hundreds of columns, or a few columns 
> with large text strings, then wouldn't the time for a sequential scan 
> depend not on the number of rows, but rather the total amount of data?

Yes, the time for a seqscan is mostly dependent on table size and not
the number of rows. But the number of rows plays a very large role in
the cost of an indexscan.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828


pgpQf4XPs38J4.pgp
Description: PGP signature


Re: RES: RES: [PERFORM] select on 1milion register = 6s

2007-07-29 Thread Decibel!
On Sun, Jul 29, 2007 at 01:44:23PM -0300, Bruno Rodrigues Siqueira wrote:
> EXPLAIN
>  ANALYZE
> select 
>to_char(data_encerramento,'mm/')  as opcoes_mes, 
>to_char(data_encerramento,'-mm') as ordem from detalhamento_bas 
> where
> 
> data_encerramento >=  '01/12/2006'  and
> data_encerramento < '01/02/2007' 
> 
> GROUP BY opcoes_mes, ordem
> ORDER BY ordem DESC
> 
> 
> 
> QUERY PLAN
> Sort  (cost=219113.10..219113.10 rows=4 width=8) (actual
> time=10079.212..10079.213 rows=2 loops=1)
>   Sort Key: to_char(data_encerramento, '-mm'::text)
>   ->  HashAggregate  (cost=219113.09..219113.09 rows=4 width=8) (actual
> time=10079.193..10079.195 rows=2 loops=1)
> ->  Seq Scan on detalhamento_bas  (cost=0.00..217945.41 rows=2335358
> width=8) (actual time=0.041..8535.792 rows=2335819 loops=1)
>   Filter: ((data_encerramento >= '2006-12-01
> 00:00:00'::timestamp without time zone) AND (data_encerramento < '2007-02-01
> 00:00:00'::timestamp without time zone))
> Total runtime: 10079.256 ms
> 
> Strange!!! Why does the index not works?

It's unlikely that it's going to be faster to index scan 2.3M rows than
to sequential scan them. Try setting enable_seqscan=false and see if it
is or not.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgp7DXcKrZWrf.pgp
Description: PGP signature


Re: [PERFORM] select on 1milion register = 6s

2007-07-29 Thread Alvaro Herrera
Scott Marlowe wrote:
> On 7/28/07, Bruno Rodrigues Siqueira <[EMAIL PROTECTED]> wrote:

> > stats_start_collector = off
> > #stats_command_string = off
> > #stats_block_level = off
> > #stats_row_level = off
> > #stats_reset_on_server_start = off
> 
> I think you need stats_row_level on for autovacuum, but I'm not 100% sure.

That's correct (of course you need "start_collector" on as well).  Most
likely, autovacuum is not even running.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


RES: RES: [PERFORM] select on 1milion register = 6s

2007-07-29 Thread Bruno Rodrigues Siqueira
Look it




EXPLAIN
 ANALYZE
select 
   to_char(data_encerramento,'mm/')  as opcoes_mes, 
   to_char(data_encerramento,'-mm') as ordem from detalhamento_bas 
where

data_encerramento  =  '01/12/2006'  

GROUP BY opcoes_mes, ordem
ORDER BY ordem DESC





QUERY PLAN
Sort  (cost=60.72..60.72 rows=1 width=8) (actual time=4.586..4.586 rows=0
loops=1)
  Sort Key: to_char(data_encerramento, '-mm'::text)
  ->  HashAggregate  (cost=60.72..60.72 rows=1 width=8) (actual
time=4.579..4.579 rows=0 loops=1)
->  Index Scan using detalhamento_bas_idx3005 on detalhamento_bas
(cost=0.00..60.67 rows=105 width=8) (actual time=4.576..4.576 rows=0
loops=1)
  Index Cond: (data_encerramento = '2006-12-01
00:00:00'::timestamp without time zone)
Total runtime: 4.629 ms




EXPLAIN
 ANALYZE
select 
   to_char(data_encerramento,'mm/')  as opcoes_mes, 
   to_char(data_encerramento,'-mm') as ordem from detalhamento_bas 
where

data_encerramento >=  '01/12/2006'  and
data_encerramento < '01/02/2007' 

GROUP BY opcoes_mes, ordem
ORDER BY ordem DESC



QUERY PLAN
Sort  (cost=219113.10..219113.10 rows=4 width=8) (actual
time=10079.212..10079.213 rows=2 loops=1)
  Sort Key: to_char(data_encerramento, '-mm'::text)
  ->  HashAggregate  (cost=219113.09..219113.09 rows=4 width=8) (actual
time=10079.193..10079.195 rows=2 loops=1)
->  Seq Scan on detalhamento_bas  (cost=0.00..217945.41 rows=2335358
width=8) (actual time=0.041..8535.792 rows=2335819 loops=1)
  Filter: ((data_encerramento >= '2006-12-01
00:00:00'::timestamp without time zone) AND (data_encerramento < '2007-02-01
00:00:00'::timestamp without time zone))
Total runtime: 10079.256 ms







Strange!!! Why does the index not works?

All my querys doesn't work with range dates I don't know what to do...
Please, help!  


Bruno



-Mensagem original-
De: Decibel! [mailto:[EMAIL PROTECTED] 
Enviada em: domingo, 29 de julho de 2007 13:36
Para: Ragnar
Cc: Bruno Rodrigues Siqueira; pgsql-performance@postgresql.org
Assunto: Re: RES: [PERFORM] select on 1milion register = 6s

On Sat, Jul 28, 2007 at 10:36:16PM +, Ragnar wrote:
> On lau, 2007-07-28 at 17:12 -0300, Bruno Rodrigues Siqueira wrote:
> 
> > where
> > 
> > to_char( data_encerramento ,'-mm') 
> > between   '2006-12' and  '2007-01'
> 
> assuming data_encerramento is a date column, try:
> WHERE data_encerramento between   '2006-12-01' and  '2007-01-31'

IMO, much better would be:

WHERE data_encerramento >= '2006-12-01' AND data_encerramento <
'2007-02-01'

This means you don't have to worry about last day of the month or
timestamp precision. In fact, since the field is actually a timestamp,
the between posted above won't work correctly.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: RES: [PERFORM] select on 1milion register = 6s

2007-07-29 Thread Decibel!
On Sat, Jul 28, 2007 at 10:36:16PM +, Ragnar wrote:
> On lau, 2007-07-28 at 17:12 -0300, Bruno Rodrigues Siqueira wrote:
> 
> > where
> > 
> > to_char( data_encerramento ,'-mm') 
> > between   '2006-12' and  '2007-01'
> 
> assuming data_encerramento is a date column, try:
> WHERE data_encerramento between   '2006-12-01' and  '2007-01-31'

IMO, much better would be:

WHERE data_encerramento >= '2006-12-01' AND data_encerramento <
'2007-02-01'

This means you don't have to worry about last day of the month or
timestamp precision. In fact, since the field is actually a timestamp,
the between posted above won't work correctly.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgp5eYJsdKDrX.pgp
Description: PGP signature


[PERFORM] Questions on Tags table schema

2007-07-29 Thread Jay Kang
Hello,

I'm currently trying to decide on a database design for tags in my web
2.0application. The problem I'm facing is that I have 3 separate
tables
i.e. cars, planes, and schools. All three tables need to interact with the
tags, so there will only be one universal set of tags for the three tables.

I read a lot about tags and the best articles I found were:

Road to Web 2.0 ( http://wyome.com/docs/Road_to_Web_2.0:_The_Database_Design )

tags: database schema (
http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html )

and a forum discussion on tags with a very similar problem:
http://www.webmasterworld.com/forum112/502.htm
But I don't like the solution, would like to stick with serial integer for
Cars, Planes and Schools tables.

Currently, this is my DB design:

Cars (carid, carname, text, etc.)
Planes (planeid, planename, text, etc.)
Schools (schoolname, text, etc.) <-- School does not take int as primary
key but a varchar.

Tags (tagid, tagname, etc)

--- Now here is where I have the question. I have to link up three separate
tables to use Tags
--- So when a new car is created in the Cars table, should I insert that
carID into the TagsItems table
--- as itemID? So something like this?

TagsItems
(
  tagid INT NOT NULL REFERENCES Tags.TagID,
  itemid INT NULL,  < really references Cars.carID and Planes.planeID
  schoolname varchar NULL  < Saves the Schools.schoolname
  itemid + tagId as Unique
)

I also have a question on the schoolname field, because it accepts varchar
not integer. There seems to be some design that would better fit my needs.
I'm asking  you guys for a little assistance.

-- 
Regards,
Jay Kang