[SQL] OR clause causing strange index performance

2004-05-20 Thread Doug Y
Hello,
  For the following query:
SELECT *
  FROM permissions p
   INNER JOIN users u
   ON u.id = p.id
   LEFT JOIN user_list ul1
  ON ul1.id = u.id
 AND ul1.type = '1'
   LEFT JOIN user_list ul2
  ON ul2.id = u.id
 AND ul2.type = '2'
   INNER JOIN lists l
   ON ( l.list_id1 = ul1.list_id1 AND l.list_id2 = ul1.list_id2 )
OR
  ( l.list_id1 = ul2.list_id1 AND l.list_id2 = ul2.list_id2 )
 WHERE
   p.code = '123456' AND p.type = 'User'
(lists table has ~ 500k records, users ~ 100k, permissions ~ 60k, user_list 
~ 530k)

lists can be associated with 2 users via the user_list table, and are 
designated by the 1 or 2, can have a user with a 1, a user with a 2 or one 
of each.

I'm getting really poor performance... about 60 seconds. Explain (see 
below) is showing its trying to use the pkey (list_id1,list_id2) on the 
list table, but not showing an index condition.

If I get rid of the OR, and only at one of the conditions it returns very 
quickly and properly set the index condition. I can't use a union because I 
would end up with duplicate rows for those that have both ul type 1 and 2

I actually started off trying the query by looking at lists first, but 
performance was awful since I can't narrow down the records like I can with 
permissions.

I know the tables aren't really set up ideally, and I actually have to join 
a few more tables to the lists table after the fact, but want to get the 
base running as efficient as possible first.

Is there any way to get this query to use the correct index condition so 
that it runs in a reasonable amount of time?

Thanks!
EXPLAIN with the OR
 QUERY 
PLAN

 Nested Loop  (cost=0.00..13051262.13 rows=1 width=1794)
   Join Filter: ((("inner".list_id1 = "outer".list_id1) OR 
("inner".list_id1 = "outer".list_id1)) AND (("inner".list_id2 = 
"outer".list_id2) OR ("inner".list_id1 = "outer".list_id1)) AND 
(("inner".list_id1 = "outer".list_id1) OR ("inner".list_id2 = 
"outer".list_id2)) AND (("inner".list_id2 = "outer".list_id2) OR 
("inner".list_id2 = "outer".list_id2)))
   ->  Nested Loop  (cost=0.00..2654.08 rows=12 width=1087)
 Join Filter: ("inner".type = '2'::character varying)
 ->  Nested Loop  (cost=0.00..427.39 rows=12 width=1032)
   Join Filter: ("inner".type = '1'::character varying)
   ->  Nested Loop  (cost=0.00..23.82 rows=2 width=977)
 ->  Index Scan using permissions_pkey on permissions 
p  (cost=0.00..12.14 rows=2 width=476)
   Index Cond: ((code = '123456'::character 
varying) AND (type = 'User'::character varying))
 ->  Index Scan using users_pkey on users 
u  (cost=0.00..4.92 rows=1 width=501)
   Index Cond: (u.id = "outer".id)
   ->  Index Scan using user_list_id on user_list 
ul1  (cost=0.00..159.86 rows=37 width=55)
 Index Cond: (ul1.id = "outer".id)
 ->  Index Scan using user_list_id on user_list 
ul2  (cost=0.00..159.86 rows=37 width=55)
   Index Cond: (ul2.id = "outer".id)
   ->  Seq Scan on lists 1  (cost=0.00..26103.61 rows=508361 width=707)
(16 rows)

Note: this example shows it trying a seq scan.. I've tried it with 
enable_seqscan off, too. When I referred above to it trying to use index 
scan, it was from an explain with an additional join to the lists table after:
->  Index Scan using lists_pkey on lists l  (cost=0.00..1872375.82 
rows=508361 width=144)

EXPLAIN without the OR
 QUERY 
PLAN

 Nested Loop  (cost=0.00..2740.09 rows=17 width=1794)
   ->  Nested Loop  (cost=0.00..2654.08 rows=12 width=1087)
 Join Filter: ("inner".type = '2'::character varying)
 ->  Nested Loop  (cost=0.00..427.39 rows=12 width=1032)
   Join Filter: ("inner".type = '1'::character varying)
   ->  Nested Loop  (cost=0.00..23.82 rows=2 width=977)
 ->  Index Scan using permissions_pkey on permissions 
p  (cost=0.00..12.14 rows=2 width=476)
   Index Cond: ((code = '123456'::character 
varying) AND (type = 'User'::character varying))
 ->  Index Scan using users_pkey on users 
u  (cost=0.00..4.92 rows=1 width=501)
   Index Cond: (u.id = "outer".id)
   ->  Index Scan using user_list_id on user_list 
ul1  (cost=0.00..159.86 rows=37 width=55)
 Index Cond: (ul1.id = "outer".id)
 ->  Index Scan using user_list_id on user_list 
ul2  (cost=0.00..159.86

Re: [SQL] OR clause causing strange index performance

2004-05-20 Thread Doug Y
Sorry,
  I just realized that my logic for the query is flawed anyway. It won't 
return the proper data set I'm after. I'll have to go back to looking at 
the lists table first.

  I still guess knowing why the query below isn't as quick as expected 
could be useful though.

At 01:32 PM 5/20/2004, Doug Y wrote:
Hello,
  For the following query:
SELECT *
  FROM permissions p
   INNER JOIN users u
   ON u.id = p.id
   LEFT JOIN user_list ul1
  ON ul1.id = u.id
 AND ul1.type = '1'
   LEFT JOIN user_list ul2
  ON ul2.id = u.id
 AND ul2.type = '2'
   INNER JOIN lists l
   ON ( l.list_id1 = ul1.list_id1 AND l.list_id2 = ul1.list_id2 )
OR
  ( l.list_id1 = ul2.list_id1 AND l.list_id2 = ul2.list_id2 )
 WHERE
   p.code = '123456' AND p.type = 'User'
(lists table has ~ 500k records, users ~ 100k, permissions ~ 60k, 
user_list ~ 530k)

lists can be associated with 2 users via the user_list table, and are 
designated by the 1 or 2, can have a user with a 1, a user with a 2 or one 
of each.

I'm getting really poor performance... about 60 seconds. Explain (see 
below) is showing its trying to use the pkey (list_id1,list_id2) on the 
list table, but not showing an index condition.

If I get rid of the OR, and only at one of the conditions it returns very 
quickly and properly set the index condition. I can't use a union because 
I would end up with duplicate rows for those that have both ul type 1 and 2

I actually started off trying the query by looking at lists first, but 
performance was awful since I can't narrow down the records like I can 
with permissions.

I know the tables aren't really set up ideally, and I actually have to 
join a few more tables to the lists table after the fact, but want to get 
the base running as efficient as possible first.

Is there any way to get this query to use the correct index condition so 
that it runs in a reasonable amount of time?

Thanks!
- cut explains off - 

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[SQL] I'm stumped

2004-12-20 Thread Doug Y
I can't figure out an efficient way to do this. Basically I had a typical 3-tier relationship:
(Employee -> Department -> Division)
However, at some point the need to move employees arose, but instead of changing the key in the emp table, we now have 
an over-ride table, so a history can be tracked.

If I want to get the info for a particular employee, its a pretty simple deal, however, getting all the employees for a 
dept or division has become troublesome.

A very simplified schema:
divisions ( div_id, div_name );
departments ( dept_id, dept_name, div_id );
employees ( emp_id, emp_name, dept_id );
emp_dept ( emp_id, dept_id, active, changed_by, changed_when );
The original way that worked well:
SELECT v.div_name, d.dept_id, e.emp_id, e.emp_name
  FROM divisions v
   INNER JOIN departments d
   ON d.div_id = v.div_id
   INNER JOIN employees e
   ON e.dept_id = d.dept_id
 WHERE v.div_id = 123;
What was initially tried:
SELECT v.div_name, COALESCE(ed.dept_id, d.dept_id), e.emp_id, e.emp_name
  FROM divisions v
   INNER JOIN departments d
   ON d.div_id = v.div_id
   INNER JOIN employees e
   ON e.dept_id = d.dept_id
   LEFT JOIN emp_dept ed
  ON ed.emp_id = e.emp_id AND ed.active = true
 WHERE v.div_id = 123;
This query is flawed, as it still always puts the employees in their original div, but reports the new dept. Which we 
didn't catch as a problem until emps were moved to depts in a different division.

I tried creating a function:
CREATE OR REPLACE FUNCTION get_empdept(int4) RETURNS int4 AS '
SELECT CASE WHEN ed.dept_id IS NOT NULL
THEN ed.dept_id
ELSE e.dept_id END
  FROM employees AS e
   LEFT JOIN emp_dept AS ed
  ON ed.emp_id = e.emp_id AND ed.active = true
 WHERE e.emp_id = $1
' LANGUAGE SQL STABLE;
And then tried:
SELECT v.div_name, d.dept_id, e.emp_id, e.emp_name
  FROM divisions v
   INNER JOIN departments d
   ON d.div_id = v.div_id
   INNER JOIN employees e
   ON get_empdept(e.emp_id) = d.dept_id
 WHERE v.div_id = 123;
However since the function is not immutable (since it does a query), I can't create an index, and the join always does a 
seq scan.

I also thought to create a view, but I don't believe Postgres supports indexed 
views. It was always using a seq scan too.
The above examples are actually quite simplified, as several other tables get joined along the way, I'm not sure a UNION 
would work or not, how would it exclude the ones that match the dept_id in the emp table for those emps that match on 
the over-ride table?

Any suggestions?
Thanks
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html