Re: [PERFORM] difficulties with time based queries

2009-04-14 Thread Nikolas Everett
> >If you often do range queries on date, consider partitioning your > table by date (something like 1 partition per month). >Of course, if you also often do range queries on something other > than date, and uncorrelated, forget it. If you pick your partition to line up with your

Re: [PERFORM] difficulties with time based queries

2009-04-14 Thread PFC
What can I do to prevent the index from getting bloated, or in whatever state it was in? What else can I do to further improve queries on this table? Someone suggested posting details of my conf file. Which settings are most likely to be useful for this? If you often do range queries on

Re: [PERFORM] difficulties with time based queries

2009-04-09 Thread Tom Lane
"Rainer Mager" writes: > So, I need indices that make it fast querying against start_time as well as > all possible combinations of channel, player, and ad. There's some general principles in the manual --- have you read http://www.postgresql.org/docs/8.3/static/indexes.html especially 11.3 and 1

Re: [PERFORM] difficulties with time based queries

2009-04-09 Thread Rainer Mager
Thanks for all of the suggestions so far. I've been trying to reduce the number of indices I have, but I'm running into a problem. I have a need to do queries on this table with criteria applied to the date and possibly any or all of the other key columns. As a reminder, here's my table:

Re: [PERFORM] difficulties with time based queries

2009-04-08 Thread Rainer Mager
So, I defragged my disk and reran my original query and it got a little better, but still far higher than I'd like. I then rebuilt (dropped and recreated) the ad_log_date_all index and reran the query and it is quite a bit better: # explain analyze select * from ad_log where date(start_time) <

Re: [PERFORM] difficulties with time based queries

2009-04-06 Thread Matthew Wakeling
On Mon, 6 Apr 2009, Rainer Mager wrote: The data should be mostly ordered by date. It is all logged in semi-realtime such that 99% will be logged within an hour of the timestamp. Also, as stated above, during this query it was about 1.2 MB/s, which I know isn't great. I admit this isn't the best

Re: [PERFORM] difficulties with time based queries

2009-04-06 Thread Robert Haas
On Sun, Apr 5, 2009 at 11:35 PM, Rainer Mager wrote: >> -Original Message- >> From: Tom Lane [mailto:t...@sss.pgh.pa.us] >> "Rainer Mager" writes: >> >> From: Tom Lane [mailto:t...@sss.pgh.pa.us] >> >> Hmm ... it's pretty unusual to see the index fetch portion of a >> bitmap >> >> scan ta

Re: [PERFORM] difficulties with time based queries

2009-04-05 Thread Rainer Mager
> -Original Message- > From: Tom Lane [mailto:t...@sss.pgh.pa.us] > "Rainer Mager" writes: > >> From: Tom Lane [mailto:t...@sss.pgh.pa.us] > >> Hmm ... it's pretty unusual to see the index fetch portion of a > bitmap > >> scan take the bulk of the runtime. Usually that part is fast and

Re: [PERFORM] difficulties with time based queries

2009-04-05 Thread Tom Lane
"Rainer Mager" writes: >> From: Tom Lane [mailto:t...@sss.pgh.pa.us] >> Hmm ... it's pretty unusual to see the index fetch portion of a bitmap >> scan take the bulk of the runtime. Usually that part is fast and where >> the pain comes is in fetching from the heap. I wonder whether that >> index

Re: [PERFORM] difficulties with time based queries

2009-04-05 Thread Rainer Mager
Thanks for all the replies, I'll try to address the follow up questions: > From: David Wilson [mailto:david.t.wil...@gmail.com] > > The stats look good and it's using a viable index for your query. What > kind of hardware is this on, and what are the relevant postgresql.conf > lines? (Or, for tha

Re: [PERFORM] difficulties with time based queries

2009-04-05 Thread Tom Lane
"Rainer Mager" writes: > explain analyze select * from ad_log where date(start_time) < > date('2009-03-31') and date(start_time) >= date('2009-03-30'); > Bitmap Heap Scan on ad_log (cost=73372.57..3699152.24 rows=2488252 > width=32) (actual time=49792.862..64611.255 rows=2268490 loops=1) >Re

Re: [PERFORM] difficulties with time based queries

2009-04-05 Thread PFC
When I try to do queries on this table I always find them slower than what I need and what I believe should be possible. -> Bitmap Index Scan on ad_log_date_all (cost=0.00..72750.51 rows=2488252 width=0) (actual time=49776.332..49776.332 rows=2268490 loops=1) Index Cond: ((date(

Re: [PERFORM] difficulties with time based queries

2009-04-05 Thread David Wilson
On Sun, Apr 5, 2009 at 7:26 PM, Rainer Mager wrote: > Bitmap Heap Scan on ad_log  (cost=73372.57..3699152.24 rows=2488252 > width=32) (actual time=49792.862..64611.255 rows=2268490 loops=1) > >    Recheck Cond: ((date(start_time) < '2009-03-31'::date) AND > (date(start_time) >= '2009-03-30'::date)

[PERFORM] difficulties with time based queries

2009-04-05 Thread Rainer Mager
I have a somewhat large table (more than 100 million rows) that contains log data with start_time and end_time columns. When I try to do queries on this table I always find them slower than what I need and what I believe should be possible. For example, I limited the following query to just a s