On Tue, Dec 9, 2014 at 4:18 PM, Robert DiFalco <robert.difa...@gmail.com> wrote:
> I'm sorry, I missed a JOIN on the second variation. It is: > > SELECT u.id, u.name, u.imageURL, u.bio, > CASE > WHEN f.friend_id IS NOT NULL THEN 'isFriend' > WHEN s.to_id IS NOT NULL THEN 'hasSentRequest' > WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest' > ELSE 'none' > END AS 'friendStatus', > (SELECT COUNT(1) AS d > FROM friends f1 > JOIN friends f2 ON f1.fiend_id = f2.friend_id > WHERE f1.user_id = 33 AND f2.user_id = u.id) > FROM users u > *LEFT OUTER JOIN friends f ON f.user_id = 33 AND f.friend_id = u.id > <http://u.id>* > LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id > LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33 > WHERE u.id != 33 AND u.name LIKE '%John%' ORDER BY u.name; > > > On Tue, Dec 9, 2014 at 10:15 AM, Robert DiFalco <robert.difa...@gmail.com> > wrote: > >> I have users, friends, and friend_requests. I need a query that >> essentially returns a summary containing: >> >> * user (name, imageURL, bio, ...) >> * Friend status (relative to an active user) >> * Is the user a friend of the active user? >> * Has the user sent a friend request to the active user? >> * Has the user received a friend request from the active user? >> * # of mutualFriends >> * Exclude the active user from the result set. >> >> So I have mocked this up two ways but both have complicated query plans >> that will be problematic with large data sets. I'm thinking that my lack of >> deep SQL knowledge is making me miss the obvious choice. >> >> Here's my two query examples: >> >> SELECT u.id, u.name, u.imageURL, u.bio, >> CASE >> WHEN EXISTS(SELECT 1 FROM friends f WHERE f.user_id = 33 AND >> f.friend_id = u.id) THEN 'isFriend' >> WHEN EXISTS(SELECT 1 FROM friend_requests s WHERE s.to_id = 33 >> AND s.from_id = u.id) THEN 'hasSentRequest' >> WHEN EXISTS(SELECT 1 FROM friend_requests r WHERE r.to_id = u.id >> AND r.from_id = 33) THEN 'hasReceivedRequest' >> ELSE 'none' >> END AS "friendStatus", >> (SELECT COUNT(1) >> FROM friends f1 >> JOIN friends f2 ON f1.friend_id = f2.friend_id >> WHERE f1.user_id = 33 AND f2.user_id = u.id) AS mutualFriends >> FROM users u >> WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name; >> >> SELECT u.id, u.name, u.imageURL, u.bio, >> CASE >> WHEN f.friend_id IS NOT NULL THEN 'isFriend' >> WHEN s.to_id IS NOT NULL THEN 'hasSentRequest' >> WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest' >> ELSE 'none' >> END AS 'friendStatus', >> (SELECT COUNT(1) AS d >> FROM friends f1 >> JOIN friends f2 ON f1.fiend_id = f2.friend_id >> WHERE f1.user_id = 33 AND f2.user_id = u.id) >> FROM users u >> LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id >> LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33 >> WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name; >> >> 33 is just the id of the active user I am using for testing. The WHERE >> clause could be anything. I'm just using "u.name" here but I'm more >> concerned about the construction of the result set than the WHERE clause. >> These have more or less similar query plans, nothing that would change >> things factorially. Is this the best I can do or am I missing the obvious? >> >> Here are the tables: >> >> >> CREATE TABLE users ( >> id BIGINT, >> name VARCHAR, >> imageURL VARCHAR >> created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, >> phone_natl BIGINT, /* National Phone Number */ >> country_e164 SMALLINT, /* E164 country code */ >> email VARCHAR(255), >> PRIMARY KEY (id), >> UNIQUE (email), >> UNIQUE (phone_natl, country_e164) >> ); >> >> >> CREATE TABLE friends ( >> user_id BIGINT, >> friend_id BIGINT, >> PRIMARY KEY (user_id, user_id), >> FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE, >> FOREIGN KEY (friend_id) REFERENCES users(id) ON DELETE CASCADE >> ); >> CREATE INDEX idx_friends_friend ON friends(friend_id); >> >> CREATE TABLE friend_requests ( >> from_id BIGINT, >> to_id BIGINT, >> created TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP, >> PRIMARY KEY (from_id, user_id), >> FOREIGN KEY (from_id) REFERENCES users(id) ON DELETE CASCADE, >> FOREIGN KEY (to_id) REFERENCES users(id) ON DELETE CASCADE >> ); >> CREATE INDEX idx_friend_requests_to ON friend_requests(to_id); >> >> Let me know if you guys need anything else. >> >> > Hello Robert, none of your schemas worked for me, here's a clean version CREATE TABLE users ( id BIGINT, name VARCHAR, imageURL VARCHAR, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, phone_natl BIGINT, country_e164 SMALLINT, email VARCHAR(255), PRIMARY KEY (id), UNIQUE (email), UNIQUE (phone_natl, country_e164) ); CREATE TABLE friends ( user_id BIGINT, friend_id BIGINT, PRIMARY KEY (user_id, friend_id), FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE, FOREIGN KEY (friend_id) REFERENCES users(id) ON DELETE CASCADE ); CREATE INDEX idx_friends_friend ON friends(friend_id); CREATE TABLE friend_requests ( from_id BIGINT, to_id BIGINT, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (from_id, to_id), FOREIGN KEY (from_id) REFERENCES users(id) ON DELETE CASCADE, FOREIGN KEY (to_id) REFERENCES users(id) ON DELETE CASCADE ); CREATE INDEX idx_friend_requests_to ON friend_requests(to_id); I may look into the query later but here's some thoughts * you want an index to speed up name searches on user.name, I suggest pg_trgm + ILIKE as a starting point. * you really expect more than a billion users? If not (or not in the near future) use INT instead. This will save you a significant amount of memory and eventually buy you time later. * as long as you don't hit the disk for the queries you'll be fine, so make sure you have enough memory or use read-slaves with smaller working sets.