[PERFORM] strategies for optimizing read on rather large tables

2005-06-04 Thread hubert lubaczewski
hi
first let me draw the outline.
we have a database which stores adverts.
each advert is in one category, and in one or more regions.
regions and categories each form a tree structure.
assume category tree:

      a
     / \
    b   c
   / \
  d   e

if any given advert is in category e, it means it is also in b and
a.
the same goes for regions.

as of now we have approx. 400 categories, 1300 regions, and 100
adverts.

since queries always have to walk the trees of categories and regions, we
created an acr_cache table (advert/category/region)
which stores, for every advert, all the categories and regions that
advert effectively belongs to,
plus some more information for sorting purposes.

this table is ~ 11 million records.
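
for reference, a rough sketch of what acr_cache could look like - only
advert_id, category_id, region_id and effective_date are actual names
from this mail, the other column names are just assumed placeholders:

CREATE TABLE acr_cache (
    advert_id       integer   NOT NULL,
    category_id     integer   NOT NULL,
    region_id       integer   NOT NULL,
    effective_date  timestamp,           -- one of the sort columns (see below)
    -- remaining sort columns; names assumed, only the types
    -- (timestamp, text, text, numeric) are given further down
    created_at      timestamp,
    sort_text_1     text,
    sort_text_2     text,
    sort_num        numeric
);

-- the index described further down in this mail
CREATE INDEX acr_cache_cat_reg_idx ON acr_cache (category_id, region_id);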

now.
we query this in more or less this manner:

select advert_id from acr_cache where category_id = ? and region_id = ?
order by XXX {asc|desc} limit 20;

where XXX is one of 5 possible fields:
timestamp,
timestamp,
text,
text,
numeric

we created an index on acr_cache (category_id, region_id)
and it works rather well.
usually.
if a given crossing (category + region) has a small number of ads (less
than 1) - the query is good enough (up to 300 milliseconds).
but when we hit crossings which result in 5 ads - the query
takes up to 10 seconds.
which is almost forever.

we thought about creating indices like this:
index on acr_cache (effective_date);
where effective_date is one of the timestamp fields.
it worked well for the crossings with lots of ads, but when we asked for a
small crossing (like 1000 ads) it took 120 seconds!
it appears that postgresql was favoring this new index instead of
using the much better index on category_id and region_id.

actually - i'm not sure what to do next.
i am even thinking about creating special (partial) indices for big
crossings, but that's just weird. plus there's the fact that vacuuming
acr_cache already takes over 3 hours!
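
for illustration, this is roughly the kind of partial index i mean - one
per big crossing (the id values 123 and 45 are just made-up examples):

CREATE INDEX acr_cache_big_crossing_idx
    ON acr_cache (effective_date)
    WHERE category_id = 123 AND region_id = 45;

but maintaining one of these by hand for every big crossing feels wrong.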


any suggestions?
hardware is a dual xeon 3 GHz, 4 GB of ram, hardware scsi raid configured as raid 1.
settings in postgresql.conf:
listen_addresses = '*'
port = 5800
max_connections = 300
superuser_reserved_connections = 50
shared_buffers = 131072
work_mem = 4096
maintenance_work_mem = 65536
fsync = false
commit_delay = 100
commit_siblings = 5
checkpoint_segments = 10
effective_cache_size = 1
random_page_cost = 1.1
log_destination = 'stderr'
redirect_stderr = true
log_directory = '/home/pgdba/logs'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = false
log_rotation_age = 1440
log_rotation_size = 502400
log_min_duration_statement = -1
log_connections = true
log_duration = true
log_line_prefix = '[%t] [%p] [EMAIL PROTECTED] '
log_statement = 'all'
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = true
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'

actual max number of connections is 120, plus some administrative connections
(psql sessions).
postgresql version 8.0.2 on linux debian sarge.

best regards,

depesz

-- 
hubert lubaczewski
Network Operations Center
eo Networks Sp. z o.o.




Re: [PERFORM] Query plan for very large number of joins

2005-06-04 Thread philb
 Despite being fairly restricted in scope,
 the schema is highly denormalized hence the large number of tables.

Do you mean normalized? Or do you mean you've pushed the superclass
details down onto each of the leaf classes?

Sorry, I meant normalized, typing faster than I'm thinking here :) The schema 
was generated by hyperjaxb, a combination of Hibernate and JAXB. This allows 
one to go from XSD -> Object model -> Persistence in a single step. I'm 
just getting the hang of Hibernate so I don't know how flexible its strategy 
is. Obviously though, the emphasis is on correctness first, so while the 
same result could possibly be achieved more quickly with many smaller queries, 
it probably considers that it's up to the DBMS to handle optimisation (not 
unreasonably either, I guess).

Since the entire process from the XSD onwards is automated, there's no scope 
for tweaking either the OR mapping code or the DB schema itself except for 
isolated troubleshooting purposes. The XSD set in question is the UBL schema 
published by OASIS, which has about 650 relations; I thought it would be 
nice to have this as a standard component in future development. 

Regards,
 -phil



I guess I'm interested in what type of modelling led you to have so many
tables in the first place?

Gotta say, never seen a 350-table join before in a real app.

Wouldn't it be possible to smooth out the model and end up with fewer
tables? Or simply break things up somewhere slightly down from the root
of the class hierarchy?

Best Regards, Simon Riggs







Re: [PERFORM] strategies for optimizing read on rather large tables

2005-06-04 Thread PFC




select advert_id from acr_cache where category_id = ? and region_id = ?
order by XXX {asc|desc} limit 20;

where XXX is one of 5 possible fields,
timestamp,
timestamp,
text,
text,
numeric


Create 5 indexes on (category_id, region_id, field),
where field is one of your 5 sort fields.
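
For instance (the last column of each index stands for one of the five sort
fields; only effective_date is an actual column name from the original mail,
the rest are placeholders):

CREATE INDEX acr_cache_cat_reg_ts1_idx
    ON acr_cache (category_id, region_id, effective_date);
CREATE INDEX acr_cache_cat_reg_ts2_idx
    ON acr_cache (category_id, region_id, created_at);
CREATE INDEX acr_cache_cat_reg_txt1_idx
    ON acr_cache (category_id, region_id, sort_text_1);
CREATE INDEX acr_cache_cat_reg_txt2_idx
    ON acr_cache (category_id, region_id, sort_text_2);
CREATE INDEX acr_cache_cat_reg_num_idx
    ON acr_cache (category_id, region_id, sort_num);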

Then write your query as:

select advert_id from acr_cache where category_id = ? and region_id = ?
order by category_id, region_id, XXX limit 20;

select advert_id from acr_cache where category_id = ? and region_id = ?
order by category_id desc, region_id desc, XXX desc limit 20;

This should put your query down to a millisecond. It will use the index  
for the lookup, the sort and the limit, and hence only retrieve 20 rows  
from the table. The downside is that you have 5 indexes to maintain, but
that's not so bad.


If your categories and regions form a tree, you should definitely look at  
the ltree datatype, which enables indexed operators like "is contained in"  
and would probably allow you to reduce the size of your cache table a  
lot.
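
A rough sketch of the idea (contrib/ltree; the table and column names here
are invented for illustration):

CREATE TABLE category_tree (
    category_id integer PRIMARY KEY,
    path        ltree NOT NULL        -- e.g. 'a.b.e' for category e
);

CREATE INDEX category_tree_path_gist ON category_tree USING gist (path);

-- "is contained in": every category at or below a.b
SELECT category_id FROM category_tree WHERE path <@ 'a.b';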







Re: [PERFORM] strategies for optimizing read on rather large tables

2005-06-04 Thread Christopher Kings-Lynne
Without reading too hard, I suggest having a quick look at the contrib/ltree 
module in the PostgreSQL distribution.  It may or may not help you.


Chris



Re: [PERFORM] strategies for optimizing read on rather large tables

2005-06-04 Thread hubert lubaczewski
On Sat, Jun 04, 2005 at 07:17:17PM +0800, Christopher Kings-Lynne wrote:
 Without reading too hard, I suggest having a quick look at contrib/ltree 
 module in the PostgreSQL distribution.  It may or may not help you.

acr_cache doesn't care about trees. and - since i have acr_cache - i
don't have to worry about trees when selecting from acr_cache.

ltree is known to me. yet i decided not to use it, to keep the ability
to move to another database engine without rewriting something that is
heavily used.

depesz




Re: [PERFORM] strategies for optimizing read on rather large tables

2005-06-04 Thread hubert lubaczewski
On Sat, Jun 04, 2005 at 01:18:04PM +0200, PFC wrote:
 Then write your query as :
 select advert_id from acr_cache where category_id = ? and region_id = ?
 order by category_id, region_id, XXX limit 20;

this is a great idea - i'll definitely check it out.

depesz




Re: [PERFORM] strategies for optimizing read on rather large tables

2005-06-04 Thread PFC



select advert_id from acr_cache where category_id = ? and region_id = ?
order by category_id, region_id, XXX limit 20;


	don't forget to mention all the index columns in the order by, or the  
planner won't use it.




Re: [PERFORM] strategies for optimizing read on rather large tables

2005-06-04 Thread hubert lubaczewski
On Sat, Jun 04, 2005 at 02:07:52PM +0200, PFC wrote:
   don't forget to mention all the index columns in the order by, or 
   the  planner won't use it.

of course.
i understand the concept. actually, i'm kind of ashamed i did not try
it before.
anyway - thanks for the great tip.

depesz




[PERFORM] Best hardware

2005-06-04 Thread Bernd Jagla
Hi there,

And sorry for bringing this up again, but I couldn't find any recent
discussion on the best hardware, and I know it actually depends on what you
are doing...

So this is what I had in mind:

Our database is going to consist of about 100 tables or so, of which only a
handful will be really big, say in the hundreds of millions of rows, fully
indexed, and we are going to add a lot of entries (n* 100 000, n100) on a
daily basis (24/5). So from my experience with MySQL I know that it is
somewhat hard on the I/O, and that the speed of the disk heads is actually
the limiting factor. Also, I only experimented with RAID5, and heard that
RAID10 will be good for reading but not writing.

So I wanted to go with RAIDKing. They have a 16-bay RAID box that they fill
with Raptors (10k rpm, 73 GB, SATA), connected via FC. Now I am not sure what
server would be good, or if I should go with redundant servers. Are quad CPUs
any good? I heard that the IBM quad system is supposed to be 40% faster than
HP or Dell? And how much RAM should I go for: is 8 GB enough? Oh, and of course
I wanted to run it under RedHat...

I would appreciate any suggestions and comments, or if you are too bored with
this topic, just send me a link where I can read up on this.

Thanks a lot for your kind replies.

Bernd


Bernd Jagla, PhD
Associate Research Scientist
Columbia University
 




Re: [PERFORM] Best hardware

2005-06-04 Thread Mischa Sandberg
Quoting Bernd Jagla [EMAIL PROTECTED]:

 ... the speed of the head of the HD is actually
 limiting. Also, I only experimented with RAID5, and heard that
 RAID10 will be good for reading but not writing.

Au contraire. RAID5 is worse than RAID10 for writing, because it has the
extra implicit read (parity stripe) for every write. I've switched all
my perftest boxes over from RAID5 to RAID10, and the smallest
performance increase was x1.6 . This is in an update-intensive system;
the WAL log's disk write rate was the controlling factor.

 Are Quad CPUs any good? I heard that the IBM quad system is supposed
to be 40%
 faster than HP or Dell???. 
Check out the other threads for negative experiences with Xeon 2x2 and
perhaps quad CPUs. Me, I'm looking forward to my first Opteron box
arriving next week.

 And how much RAM should go for: are 8GB enough? Oh, of course I wanted
to run it under RedHat...

First off, you need enough RAM to hold all your connections. Run your
app, watch the RSS column of ps. For my own simpler apps (that pump
data into the db) I allow 20MB/connection.

Next, if you are heavy on inserts, your tables will never fit in RAM,
and you really just need enough to hold the top levels of the indexes.
Look at the disk space used in your $PGDATA/base/<dboid>/<tableoid>
files, and you can work out whether holding ALL your indexes in memory
is feasible. 
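
A quick way to eyeball that from SQL (just a sketch; relpages counts 8 kB
blocks and is only as fresh as the last VACUUM/ANALYZE):

SELECT relname, relpages * 8 / 1024 AS approx_mb
FROM pg_class
WHERE relkind = 'i'
ORDER BY relpages DESC
LIMIT 20;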

If you are heavy on updates, the above holds, but YMMV depending on
locality of reference; you have to run your own tests. 

If you have concurrent big queries, all bets are off --- ask not how
much RAM you need, but how much you can afford :-)


