On Wed, 2014-06-18 at 18:08 -0600, Rob Sargent wrote: 

> On 06/18/2014 05:47 PM, Jason Long wrote:
> 
> 
> > I have a large table of access logs to an application.  
> > 
> > I want to find all rows whose startdate/enddate range overlaps that of
> > any other row.
> > 
> > The query below seems to work, but does not finish unless I specify a
> > single id.  
> > 
> > select distinct a1.id
> > from t_access a1, 
> >         t_access a2 
> > where tstzrange(a1.startdate, a1.enddate) && 
> >       tstzrange(a2.startdate, a2.enddate) 
> > 
> > 
> > 
> > 
> 
> I'm sure your best bet is a windowing function, but your
> description suggests there is no index on the start/end date columns.
> You probably want those in any event.


There are indexes on startdate and enddate.
If I specify a known a1.id=1234 then the query returns all records that
overlap it, but this takes 1.7 seconds.

There are about 2 million records in the table.

I will see what I come up with on the window function.
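
For reference, a rough sketch of what a window-function version might
look like. It is not the original query. It assumes every row it
considers has startdate < enddate, and it treats each period as
half-open, [startdate, enddate), which is the default for tstzrange().
Rows with a NULL enddate (open-ended periods) are filtered out here and
would need separate handling. The aliases prev_max_end and next_start
are made up for the example.

```sql
-- Sketch only: report ids whose period overlaps at least one other row.
SELECT id
FROM (
    SELECT id,
           startdate,
           enddate,
           -- latest enddate among all rows sorted before this one
           max(enddate) OVER (ORDER BY startdate, id
                              ROWS BETWEEN UNBOUNDED PRECEDING
                                       AND 1 PRECEDING) AS prev_max_end,
           -- startdate of the row sorted immediately after this one
           lead(startdate) OVER (ORDER BY startdate, id) AS next_start
    FROM t_access
    WHERE startdate < enddate   -- also drops rows where either bound is NULL
) s
WHERE startdate < prev_max_end   -- overlaps some earlier-starting row
   OR next_start < enddate;      -- overlaps the next-starting row
```

The idea: once the rows are sorted by startdate, a row overlaps an
earlier row exactly when some earlier enddate is later than its own
startdate, and it overlaps a later row exactly when the very next
startdate is earlier than its own enddate. That takes one sort of the
table rather than a comparison of every pair of rows.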

If anyone else has some suggestions let me know.

This is what I get for EXPLAIN ANALYZE with the id specified.

Nested Loop  (cost=0.43..107950.50 rows=8825 width=84) (actual time=2803.932..2804.558 rows=11 loops=1)
   Join Filter: (tstzrange(a1.startdate, a1.enddate) && tstzrange(a2.startdate, a2.enddate))
   Rows Removed by Join Filter: 1767741
   ->  Index Scan using t_access_pkey on t_access a1  (cost=0.43..8.45 rows=1 width=24) (actual time=0.016..0.019 rows=1 loops=1)
         Index Cond: (id = 1928761)
   ->  Seq Scan on t_access a2  (cost=0.00..77056.22 rows=1764905 width=60) (actual time=0.006..1200.657 rows=1767752 loops=1)
         Filter: (enddate IS NOT NULL)
         Rows Removed by Filter: 159270
Total runtime: 2804.599 ms
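
Reading the plan above: the && condition appears only as a Join Filter
on top of a sequential scan of a2, so the indexes on startdate and
enddate are not used for it. If those are ordinary B-tree indexes, they
cannot serve the range-overlap operator; PostgreSQL supports && on range
types through GiST indexes. A possible sketch follows, assuming both
columns are timestamptz (so the indexed expression is immutable) and
that no row has startdate later than enddate. The index name is
hypothetical.

```sql
-- Sketch only. t_access, startdate and enddate come from the query above;
-- the index name t_access_period_gist is made up for this example.
-- The build fails if any row has startdate > enddate, because
-- tstzrange() rejects a lower bound greater than the upper bound.
CREATE INDEX t_access_period_gist
    ON t_access
 USING gist (tstzrange(startdate, enddate));

-- Refresh planner statistics so the new expression index is considered.
ANALYZE t_access;
```

The query has to use the same expression, tstzrange(startdate, enddate),
for the planner to match it against the index.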


And this is the plain EXPLAIN output without the id specified. EXPLAIN
ANALYZE does not finish unless the id is specified.

Nested Loop  (cost=0.00..87949681448.20 rows=17005053815 width=84)
   Join Filter: (tstzrange(a1.startdate, a1.enddate) && tstzrange(a2.startdate, a2.enddate))
   ->  Seq Scan on t_access a2  (cost=0.00..77056.22 rows=1764905 width=60)
         Filter: (enddate IS NOT NULL)
   ->  Materialize  (cost=0.00..97983.33 rows=1927022 width=24)
         ->  Seq Scan on t_access a1  (cost=0.00..77056.22 rows=1927022 width=24)
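
Two things stand out in this form of the query. First, a1 and a2 can be
the same row, and every row with a non-empty range overlaps itself, so
the self-join as written would return nearly every id. Second, the plan
checks each a2 row against the whole materialized table, which is where
the estimate of about 17 billion joined rows comes from. Below is a
hedged sketch of a rewrite that excludes the self-match and stops at the
first overlapping partner. It is likely to be fast only if an index that
supports && exists on the same range expression, for example a GiST
index on tstzrange(startdate, enddate).

```sql
-- Sketch only: ids of rows whose period overlaps at least one *other* row.
SELECT a1.id
FROM t_access a1
WHERE EXISTS (
    SELECT 1
    FROM t_access a2
    WHERE a2.id <> a1.id    -- a row should not count as overlapping itself
      AND tstzrange(a2.startdate, a2.enddate)
          && tstzrange(a1.startdate, a1.enddate)
);
```

EXISTS also makes the DISTINCT unnecessary, since each a1 row is
returned at most once.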
