subject:"Inefficient query processing\?"

Re: Inefficient query processing?

2008-02-11 Thread Perrin Harkins

On Feb 10, 2008 5:30 PM, Yves Goergen [EMAIL PROTECTED] wrote:
 My problem is that the sub-select in line 7 (SELECT 1) takes a rather
 long time. (When I remove it, it's much faster.)

This is a known issue with EXISTS/NOT EXISTS subqueries in MySQL,
which has some fixes slated for MySQL 6.  In the meantime, there are
ways to rewrite most of these queries using JOIN/LEFT JOIN.  Read some
of the articles on http://xaprb.com/ about subqueries for a more
detailed explanation and examples of rewrites.

- Perrin

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

2008-02-11 Thread Yves Goergen


On 11.02.2008 19:51 CE(S)T, Perrin Harkins wrote:

On Feb 10, 2008 5:30 PM, Yves Goergen [EMAIL PROTECTED] wrote:

My problem is that the sub-select in line 7 (SELECT 1) takes a rather
long time. (When I remove it, it's much faster.)


This is a known issue with EXISTS/NOT EXISTS subqueries in MySQL,
which has some fixes slated for MySQL 6.  In the meantime, there are
ways to rewrite most of these queries using JOIN/LEFT JOIN.  Read some
of the articles on http://xaprb.com/ about subqueries for a more
detailed explanation and examples of rewrites.


Thank you for the link. Is there some way to get only the headlines and 
a summary for all entries? Reading through the entire contents by month 
and finding the misleading captions is hard work for such masses of 
content. The search function didn't give me the desired results.


--
Yves Goergen LonelyPixel [EMAIL PROTECTED]
Visit my web laboratory at http://beta.unclassified.de

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

2008-02-11 Thread Peter Brawley


Yves

Is there some way to get only the headlines ...

For brief discussion  some examples see 'The [Not] Exists query 
pattern' at http://www.artfulsoftware.com/infotree/queries.php.


PB

Yves Goergen wrote:

On 11.02.2008 19:51 CE(S)T, Perrin Harkins wrote:
On Feb 10, 2008 5:30 PM, Yves Goergen [EMAIL PROTECTED] 
wrote:

My problem is that the sub-select in line 7 (SELECT 1) takes a rather
long time. (When I remove it, it's much faster.)


This is a known issue with EXISTS/NOT EXISTS subqueries in MySQL,
which has some fixes slated for MySQL 6.  In the meantime, there are
ways to rewrite most of these queries using JOIN/LEFT JOIN.  Read some
of the articles on http://xaprb.com/ about subqueries for a more
detailed explanation and examples of rewrites.


Thank you for the link. Is there some way to get only the headlines 
and a summary for all entries? Reading through the entire contents by 
month and finding the misleading captions is hard work for such masses 
of content. The search function didn't give me the desired results.




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

2008-02-11 Thread Perrin Harkins

On Feb 11, 2008 4:46 PM, Yves Goergen [EMAIL PROTECTED] wrote:
 Thank you for the link. Is there some way to get only the headlines and
 a summary for all entries? Reading through the entire contents by month
 and finding the misleading captions is hard work for such masses of
 content. The search function didn't give me the desired results.

I use Google:
http://www.google.com/search?q=site%3Axaprb.com+%22NOT+EXISTS%22

- Perrin

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

2008-02-11 Thread Peter Brawley


Yves

If user.additionalkeylist and tag.readaccesskeylist are not lists, 
naming them `...list` misleads  distracts.


You asked earlier how to fit my preliminary solution into your problem. 
The answer is to (i) write the query that lists access-denied messages, 
then (ii) write a simple exclusion join from messages to that derived table.


But on (i), how user.additionalkeylist and tag.readaccesskeylist work 
remains confusing. You appear to say access may come from ...

(i) message-message_revision-message_revision_tag.readaccesskeylist, or
(ii) message_revision-user.additionalkeylist
which implies there are positive values which provide access, but your 
original query used the condition


   readaccesskeylist /is not null/

as a test for access /refusal/, which seems to contradict what you now say.

PB

-

Yves Goergen wrote:

On 11.02.2008 17:32 CE(S)T, Peter Brawley wrote:
1. user.additionalkeylist and tag.readaccesskeylist are atomic 
despite their names?


Yes, I forgot the types. Everything is scalar, varchar or integer. 
There are not set or otherwise complex data types.


2. You have reciprocal foreign keys, keylist.key referencing 
user(userID) and user.additionalkeylist referencing keylist.keylistID?


Basically, yes. Although there is a contraint in my application that 
is not visible in the database structure: I distinguish between 
personal and virtual keys. Personal keys must not be part of a 
user's additional keys list. Virtual keys must not have an additional 
keys list on their own. (And they must not have logon information.) So 
there cannot be a cyclic reference. This is documented in the source 
code and will be enforced on the application layer later.

Re: Inefficient query processing?

2008-02-11 Thread Peter Brawley


Yves,

Okay. Then the long form.

1. user.additionalkeylist and tag.readaccesskeylist are atomic despite 
their names?


2. You have reciprocal foreign keys, keylist.key referencing 
user(userID) and user.additionalkeylist referencing keylist.keylistID?


PB

Yves Goergen wrote:

On 11.02.2008 11:18 CE(S)T, Peter Brawley wrote:

Unclear.


Okay. Then the long form.

My application is a messaging application that supports multiple 
users, messages with revisions, tags and access control.


A user is identified by a UserId which I also call key. (Imagine it 
like the key you have for your front door.) A user can be assigned 
additional keys that give him way to additional messages. (Imagine it 
like the key for your office and the neighour's.) A user is also the 
author of a message revision.


A keylist supports storing multiple keys for another entity. This 
table is only a product of database normalisation. It is identified by 
a KeylistId.


Tags are predefined, i.e. an author cannot use tags that are not 
already there. Tags are identified by their TagId. Tags can also be 
used to restrict access to messages that have them assigned. For this, 
a tag can have a list of keys that can be used to access the message. 
If a keylist is assigned (ReadAccessKeylist IS NOT NULL), then any key 
from the list will do. E.g. you can restrict access on a certain tag 
to the sales and development departments, locking out research dept. 
and anonymous guests. If no key list is defined (ReadAccessKeylist IS 
NULL), then no access restrictions are imposed on that tag.


A message contains multiple revisions (can be one, too) that keep the 
actual data, the message content, subject, timestamp etc. A message 
has a MessageId, a revision is identified by a MessageId and a 
RevisionNumber, both starting at 1. A message can have its own 
individual access keylist that behaves the same was as for tags, but 
it only applies to the single respective message.


A message revision can be assigned tags that are used for 
classification and access control. These tag associations are stored 
in the message_revision_tag table. For search purposes, a single 
message revision is linked to that will be regarded. This is the 
SearchRevision of a message that is determined by other conditions and 
stored persistently.


This is the relevant part of the database schema:
message (MessageId)
message_revision (MessageId references message, RevisionNumber, Author
  references user(UserId), CreatedTime, ...)
keylist (KeylistId, Key references user(UserId))
user (UserId, AdditionalKeylist references keylist(KeylistId))
tag (TagId, ReadAccessKeylist references keylist(KeylistId))
message_revision_tag (MessageId, RevisionNumber, TagId references tag)

One additional constraint:
message_revision_tag (MessageId, RevisionNumber) reference to
  message_revision

The main search query only retrieves MessageIds. It must only return 
messages that the current session user has access to. This access may 
come from the messages' ReadAccessKeylist or any of the assigned tags' 
ReadAccessKeylist. The session user can be granted access for his own 
personal key (UserId) or one of the additional keys in his keyring 
(AdditionalKeylist). If there are multiple access lists, the user must 
pass all of them to get the message. I.e. if a message has an 
individual keylist and also some tags assigned that restrict access, 
the user must be in each of those key lists to get access.


This is probably the most tricky part to understand. Please tell me if 
you have questions.


The entire SQL query is now:

 SELECT m.MessageId
 FROM message m

WHERE
  -- Grant all access to administrators (so that they can alter the 
access key lists)

  (:isAdmin OR
-- Message has no read access keylist
(m.ReadAccessKeylist IS NULL
OR EXISTS
  -- Message's read access keylist contains personal or 
additional user key

  (SELECT 1
  FROM message m2, keylist m2k
  WHERE m2k.KeylistId = m2.ReadAccessKeylist AND
m2.MessageId = m.MessageId AND
(m2k.UserId = :sessionUserId OR
  m2k.UserId IN
(SELECT uk.UserId
FROM user u, keylist uk
WHERE uk.KeylistId = u.AdditionalKeylist AND
  u.UserId = :sessionUserId
AND NOT EXISTS
  -- There must not be any tag that denies access
  (SELECT
EXISTS
  -- Tag's read access keylist contains personal or 
additional user key

  (SELECT 1
  FROM keylist tk
  WHERE tk.KeylistId = t.ReadAccessKeylist AND
(tk.UserId = :sessionUserId OR
  tk.UserId IN
(SELECT uk.UserId
FROM user u, keylist uk
WHERE uk.KeylistId = u.AdditionalKeylist AND
  u.UserId = :sessionUserId)))
AS Allowed
  FROM message_revision_tag mrt
JOIN tag t USING (TagId)
  WHERE mrt.MessageId = m.MessageId AND
mrt.RevisionNumber =

Re: Inefficient query processing?

2008-02-11 Thread Yves Goergen


On 11.02.2008 20:13 CE(S)T, Peter Brawley wrote:
If user.additionalkeylist and tag.readaccesskeylist are not lists, 
naming them `...list` misleads  distracts.


Well, these fields contain KeylistId values from the keylist table, so 
I thought naming them *Keylist would be good enough.


But on (i), how user.additionalkeylist and tag.readaccesskeylist work 
remains confusing. You appear to say access may come from ...

(i) message-message_revision-message_revision_tag.readaccesskeylist, or
(ii) message_revision-user.additionalkeylist
which implies there are positive values which provide access, but your 
original query used the condition


readaccesskeylist /is not null/

as a test for access /refusal/, which seems to contradict what you now say.


message.ReadAccessKeylist and message.SearchRevision- 
message_revision_tag.ReadAccessKeylist are a list of keys of which *one* 
is required to get in.


user.AdditionalKeylist is a list of keys that the user possesses and of 
which *one* can be used to get in.


One list contains the keys that can be used, the other two lists contain 
keys that are allowed. One gives keys, the other accept keys. Imagine it 
like the user coming along with a keyring, trying to open a door with 
multiple keyholes. One of his keys must fit in one of the keyholes to 
get in.


And the problem here is that I need to test whether there is not a 
single tag for a (known) revision of a message that has an associated 
keylist to which no keys of the session user fits. If there was such a 
tag, access would be denied. To grant access, there must only be tags 
that either have ReadAccessKeylist IS NULL or that contain a key to 
which one of the session user's additional keys fits.


I'm still wondering if there's a way to explain all this better with 
some graphics.


--
Yves Goergen LonelyPixel [EMAIL PROTECTED]
Visit my web laboratory at http://beta.unclassified.de

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

2008-02-11 Thread Yves Goergen


On 11.02.2008 17:32 CE(S)T, Peter Brawley wrote:
1. user.additionalkeylist and tag.readaccesskeylist are atomic despite 
their names?


Yes, I forgot the types. Everything is scalar, varchar or integer. There 
are not set or otherwise complex data types.


2. You have reciprocal foreign keys, keylist.key referencing 
user(userID) and user.additionalkeylist referencing keylist.keylistID?


Basically, yes. Although there is a contraint in my application that is 
not visible in the database structure: I distinguish between personal 
and virtual keys. Personal keys must not be part of a user's 
additional keys list. Virtual keys must not have an additional keys list 
on their own. (And they must not have logon information.) So there 
cannot be a cyclic reference. This is documented in the source code and 
will be enforced on the application layer later.


--
Yves Goergen LonelyPixel [EMAIL PROTECTED]
Visit my web laboratory at http://beta.unclassified.de

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

2008-02-11 Thread Yves Goergen


On 11.02.2008 00:29 CE(S)T, Peter Brawley wrote:

message (messageID)
keylist (keylistID)
tag ( tagID, readaccesskeylist references keylist(keylistID) )
message_revision_tag ( ???, messageID references message(messageID), 
tagID references tag(tagID))


Another table:
message_revision(MessageId references message, RevisionNumber)

Correction:
message_revision_tag(MessageId, RevisionNumber, TagId)
  (MessageId, RevisionNumber) references message_revision


(i) Finding messages which have a deny-access tag looks like a simple join:

SELECT DISTINCT messageID
FROM message_revision_tag AS mrt
JOIN tag AS t ON mrt.tagID=t.tagID
WHERE t.readaccesskeylist IS NOT NULL;

(ii) We get the messages not in the above result with a simple exclusion 
join:


SELECT messageID
FROM message m
LEFT JOIN (
  SELECT DISTINCT messageID
  FROM message_revision_tag AS mrt
  JOIN tag AS t ON mrt.tagID=t.tagID
  WHERE t.readaccesskeylist IS NOT NULL
) AS banned USING (messageID)
WHERE banned.messageID IS NULL,

Or did I miss something?


I'm afraid I cannot integrate this in my large query. It looks too 
simple and I don't know where to put its parts. Maybe I'll really have 
to show the full schema and the complete query...


It's not only that my entire query will find messages that have no tag 
with a ReadAccessKeylist assigned; it will rather find messages that 
have no tag with a keylist which does not include the currently logged 
in user's UserId or one of this user's additional keys, which are again 
stored in a keylist.


--
Yves Goergen LonelyPixel [EMAIL PROTECTED]
Visit my web laboratory at http://beta.unclassified.de

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

2008-02-11 Thread Yves Goergen


On 11.02.2008 11:18 CE(S)T, Peter Brawley wrote:

Unclear.


Okay. Then the long form.

My application is a messaging application that supports multiple users, 
messages with revisions, tags and access control.


A user is identified by a UserId which I also call key. (Imagine it 
like the key you have for your front door.) A user can be assigned 
additional keys that give him way to additional messages. (Imagine it 
like the key for your office and the neighour's.) A user is also the 
author of a message revision.


A keylist supports storing multiple keys for another entity. This table 
is only a product of database normalisation. It is identified by a 
KeylistId.


Tags are predefined, i.e. an author cannot use tags that are not already 
there. Tags are identified by their TagId. Tags can also be used to 
restrict access to messages that have them assigned. For this, a tag can 
have a list of keys that can be used to access the message. If a keylist 
is assigned (ReadAccessKeylist IS NOT NULL), then any key from the list 
will do. E.g. you can restrict access on a certain tag to the sales and 
development departments, locking out research dept. and anonymous 
guests. If no key list is defined (ReadAccessKeylist IS NULL), then no 
access restrictions are imposed on that tag.


A message contains multiple revisions (can be one, too) that keep the 
actual data, the message content, subject, timestamp etc. A message has 
a MessageId, a revision is identified by a MessageId and a 
RevisionNumber, both starting at 1. A message can have its own 
individual access keylist that behaves the same was as for tags, but it 
only applies to the single respective message.


A message revision can be assigned tags that are used for classification 
and access control. These tag associations are stored in the 
message_revision_tag table. For search purposes, a single message 
revision is linked to that will be regarded. This is the SearchRevision 
of a message that is determined by other conditions and stored persistently.


This is the relevant part of the database schema:
message (MessageId)
message_revision (MessageId references message, RevisionNumber, Author
  references user(UserId), CreatedTime, ...)
keylist (KeylistId, Key references user(UserId))
user (UserId, AdditionalKeylist references keylist(KeylistId))
tag (TagId, ReadAccessKeylist references keylist(KeylistId))
message_revision_tag (MessageId, RevisionNumber, TagId references tag)

One additional constraint:
message_revision_tag (MessageId, RevisionNumber) reference to
  message_revision

The main search query only retrieves MessageIds. It must only return 
messages that the current session user has access to. This access may 
come from the messages' ReadAccessKeylist or any of the assigned tags' 
ReadAccessKeylist. The session user can be granted access for his own 
personal key (UserId) or one of the additional keys in his keyring 
(AdditionalKeylist). If there are multiple access lists, the user must 
pass all of them to get the message. I.e. if a message has an individual 
keylist and also some tags assigned that restrict access, the user must 
be in each of those key lists to get access.


This is probably the most tricky part to understand. Please tell me if 
you have questions.


The entire SQL query is now:

 SELECT m.MessageId
 FROM message m

WHERE
  -- Grant all access to administrators (so that they can alter the access key 
lists)
  (:isAdmin OR
-- Message has no read access keylist
(m.ReadAccessKeylist IS NULL
OR EXISTS
  -- Message's read access keylist contains personal or additional user key
  (SELECT 1
  FROM message m2, keylist m2k
  WHERE m2k.KeylistId = m2.ReadAccessKeylist AND
m2.MessageId = m.MessageId AND
(m2k.UserId = :sessionUserId OR
  m2k.UserId IN
(SELECT uk.UserId
FROM user u, keylist uk
WHERE uk.KeylistId = u.AdditionalKeylist AND
  u.UserId = :sessionUserId
AND NOT EXISTS
  -- There must not be any tag that denies access
  (SELECT
EXISTS
  -- Tag's read access keylist contains personal or additional user key
  (SELECT 1
  FROM keylist tk
  WHERE tk.KeylistId = t.ReadAccessKeylist AND
(tk.UserId = :sessionUserId OR
  tk.UserId IN
(SELECT uk.UserId
FROM user u, keylist uk
WHERE uk.KeylistId = u.AdditionalKeylist AND
  u.UserId = :sessionUserId)))
AS Allowed
  FROM message_revision_tag mrt
JOIN tag t USING (TagId)
  WHERE mrt.MessageId = m.MessageId AND
mrt.RevisionNumber = m.SearchRevision AND
t.ReadAccessKeylist IS NOT NULL   -- Only regard tags with read 
access keylist (only they can fail the access test)

   HAVING NOT Allowed)

:isAdmin is 0 or 1 depending on whether the session user is an 
administrator. Admins always get the result

Re: Inefficient query processing?

2008-02-11 Thread Peter Brawley


Yves

it will rather find messages that have no tag with a keylist
which does not include the currently logged in user's UserId
or one of this user's additional keys, which are again stored
in a keylist.

Unclear.

PB

Yves Goergen wrote:

On 11.02.2008 00:29 CE(S)T, Peter Brawley wrote:

message (messageID)
keylist (keylistID)
tag ( tagID, readaccesskeylist references keylist(keylistID) )
message_revision_tag ( ???, messageID references message(messageID), 
tagID references tag(tagID))


Another table:
message_revision(MessageId references message, RevisionNumber)

Correction:
message_revision_tag(MessageId, RevisionNumber, TagId)
  (MessageId, RevisionNumber) references message_revision

(i) Finding messages which have a deny-access tag looks like a simple 
join:


SELECT DISTINCT messageID
FROM message_revision_tag AS mrt
JOIN tag AS t ON mrt.tagID=t.tagID
WHERE t.readaccesskeylist IS NOT NULL;

(ii) We get the messages not in the above result with a simple 
exclusion join:


SELECT messageID
FROM message m
LEFT JOIN (
  SELECT DISTINCT messageID
  FROM message_revision_tag AS mrt
  JOIN tag AS t ON mrt.tagID=t.tagID
  WHERE t.readaccesskeylist IS NOT NULL
) AS banned USING (messageID)
WHERE banned.messageID IS NULL,

Or did I miss something?


I'm afraid I cannot integrate this in my large query. It looks too 
simple and I don't know where to put its parts. Maybe I'll really have 
to show the full schema and the complete query...


It's not only that my entire query will find messages that have no tag 
with a ReadAccessKeylist assigned; it will rather find messages that 
have no tag with a keylist which does not include the currently logged 
in user's UserId or one of this user's additional keys, which are 
again stored in a keylist.




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Inefficient query processing?

2008-02-10 Thread Yves Goergen


Hi,

I have a performance problem with one of my SQL queries. It's a rather 
complex one so I'll spare you the details. This is the situation:


In my system, there's messages, tags and keylists. Each message has 
message_revisions, each message_revision can be assigned tags (stored in 
message_revision_tag). Each tag points to a keylist that contains all 
keys that grant access to messages with that tag. If a tag has no 
keylist assigned (ReadAccessKeylist IS NULL), then everybody may access 
the messages.


This query finds all messages that don't have a tag assigned that would 
deny access to it. (Assume every message has only a single revision with 
the number 1, for now. The actual user comparison with another 
sub-select is hidden in some more.)


SELECT m.MessageId
FROM message m
WHERE
  NOT EXISTS
(SELECT
  EXISTS
(SELECT 1
FROM keylist tk
WHERE tk.KeylistId = t.ReadAccessKeylist AND some more)
  AS Allowed
FROM message_revision_tag mrt
  JOIN tag t USING (TagId)
WHERE mrt.MessageId = m.MessageId AND
  mrt.RevisionNumber = 1 AND
  t.ReadAccessKeylist IS NOT NULL
HAVING NOT Allowed)

My problem is that the sub-select in line 7 (SELECT 1) takes a rather 
long time. (When I remove it, it's much faster.) I'm not sure why, 
because there's not a single keylist in that table, however. Another 
issue is that this query should actually never be regarded. The 
condition in the second-last line is always false. A simple test 
confirms that:


SELECT COUNT(*) FROM tag WHERE ReadAccessKeylist IS NOT NULL
- 0

So there should never be a reason why the FROM in line 11 would result 
in a row (filtering out with the conditions, of course). But it still 
gets executed. When I make sure that the condition is always false, by 
adding a AND 0 just before the HAVING clause, the whole thing runs 
much faster.


I have no separate timings, but it's in the magnitude of 1 vs. 5-10 
milliseconds for the query. Run a hundred times makes a noticeable delay.


My understanding of it all was that first the FROM clause is regarded to 
see what rows there are. Then WHERE filters them, then SELECT will pick 
some columns (and thereby execute my sub-select expression) and finally 
HAVING filters again. Since in my theory there is no single row, SELECT 
has nothing to do. But obviously it has.


Some suggestion what's going on?

--
Yves Goergen LonelyPixel [EMAIL PROTECTED]
Visit my web laboratory at http://beta.unclassified.de

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

2008-02-10 Thread Peter Brawley


Yves,

 My problem is that the sub-select in line 7
 (SELECT 1) takes a rather long time

It might be possible to simplify. Do I have the schema right?

message (messageID)
keylist (keylistID)
tag ( tagID, readaccesskeylist references keylist(keylistID) )
message_revision_tag ( ???, messageID references message(messageID), 
tagID references tag(tagID))


Or is there a message_revisions table missing from your description? On 
what I have, if the condition tag.readaccesskeylist IS NOT NULL defines 
a 'banned' condition, perhaps the query can be written without 
referencing the keylist table, since the requirement simply translates 
to finding all messages with no tag having a non-null readaccesskeylist, 
i.e. ...

(i} find the messages which have a non-null tag.readaccesskeylist,
(ii) find the messages which are not in [i].

(i) Finding messages which have a deny-access tag looks like a simple join:

SELECT DISTINCT messageID
FROM message_revision_tag AS mrt
JOIN tag AS t ON mrt.tagID=t.tagID
WHERE t.readaccesskeylist IS NOT NULL;

(ii) We get the messages not in the above result with a simple exclusion 
join:


SELECT messageID
FROM message m
LEFT JOIN (
 SELECT DISTINCT messageID
 FROM message_revision_tag AS mrt
 JOIN tag AS t ON mrt.tagID=t.tagID
 WHERE t.readaccesskeylist IS NOT NULL
) AS banned USING (messageID)
WHERE banned.messageID IS NULL,

Or did I miss something?

PB
http://www.artfulsoftware.com

-

Yves Goergen wrote:

Hi,

I have a performance problem with one of my SQL queries. It's a rather 
complex one so I'll spare you the details. This is the situation:


In my system, there's messages, tags and keylists. Each message has 
message_revisions, each message_revision can be assigned tags (stored 
in message_revision_tag). Each tag points to a keylist that contains 
all keys that grant access to messages with that tag. If a tag has no 
keylist assigned (ReadAccessKeylist IS NULL), then everybody may 
access the messages.


This query finds all messages that don't have a tag assigned that 
would deny access to it. (Assume every message has only a single 
revision with the number 1, for now. The actual user comparison with 
another sub-select is hidden in some more.)


SELECT m.MessageId
FROM message m
WHERE
  NOT EXISTS
(SELECT
  EXISTS
(SELECT 1
FROM keylist tk
WHERE tk.KeylistId = t.ReadAccessKeylist AND some more)
  AS Allowed
FROM message_revision_tag mrt
  JOIN tag t USING (TagId)
WHERE mrt.MessageId = m.MessageId AND
  mrt.RevisionNumber = 1 AND
  t.ReadAccessKeylist IS NOT NULL
HAVING NOT Allowed)

My problem is that the sub-select in line 7 (SELECT 1) takes a 
rather long time. (When I remove it, it's much faster.) I'm not sure 
why, because there's not a single keylist in that table, however. 
Another issue is that this query should actually never be regarded. 
The condition in the second-last line is always false. A simple test 
confirms that:


SELECT COUNT(*) FROM tag WHERE ReadAccessKeylist IS NOT NULL
- 0

So there should never be a reason why the FROM in line 11 would result 
in a row (filtering out with the conditions, of course). But it still 
gets executed. When I make sure that the condition is always false, by 
adding a AND 0 just before the HAVING clause, the whole thing runs 
much faster.


I have no separate timings, but it's in the magnitude of 1 vs. 5-10 
milliseconds for the query. Run a hundred times makes a noticeable delay.


My understanding of it all was that first the FROM clause is regarded 
to see what rows there are. Then WHERE filters them, then SELECT will 
pick some columns (and thereby execute my sub-select expression) and 
finally HAVING filters again. Since in my theory there is no single 
row, SELECT has nothing to do. But obviously it has.


Some suggestion what's going on?



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Re: Inefficient query processing?

Inefficient query processing?

Re: Inefficient query processing?

13 matches

Site Navigation

Mail list logo

Footer information