Thanks to the guidance of an answer submitted by LucaM on the  
StackOverflow post, I was able to come up with a SQL query and  
ActiveRecord find query that does it.

SQL:

SELECT c.park_id, a.name FROM parks a, features b, features_parks c  
WHERE (c.park_id = a.id) AND (c.feature_id = b.id) AND (c.feature_id  
IN (21, 51, 15, 24)) GROUP BY a.id HAVING count(*)=4;

Rails:

       feature_ids = [21, 51, 15, 24]
       @parks = Park.all(
                     :joins => :features,
                     :group => "parks.id",
                     :select => "parks.*",
                     :conditions => ["features.id in (?)", feature_ids],
                     :having => "count(*)=#{feature_ids.size}"
                )

The key was grouping and using having to ensure that the items each  
had the max number of features that we're searching on.

I hope this helps someone.

-Kevin


On Jul 8, 2009, at 12:22 PM, Kevin Elliott wrote:

> To further show an example of trying to do this, here is a 'real  
> world' example. However, in this particular case, I'm only showing  
> two features, but in real usage, a user might select as many as 50  
> features or so.
>
> mysql> SELECT c.park_id, a.name, c.feature_id, b.name FROM parks a,  
> features b, features_parks c WHERE (c.park_id = a.id) AND  
> (c.feature_id = b.id) AND (c.feature_id = 21);
> +---------+-------------------------+------------+--------+
> | park_id | name                    | feature_id | name   |
> +---------+-------------------------+------------+--------+
> |    3120 | Mitchell Park           |         21 | Swings |
> |   21385 | Boulware Park           |         21 | Swings |
> |    2351 | Bowden Park             |         21 | Swings |
> |   21561 | Cameron Park            |         21 | Swings |
> |   24791 | Ramos Park              |         21 | Swings |
> |    2585 | Eleanor Park            |         21 | Swings |
> |    2627 | Flood County Park       |         21 | Swings |
> |    2374 | Burgess Park            |         21 | Swings |
> |   22986 | Holbrook-Palmer Park    |         21 | Swings |
> |    2799 | Huddart Park            |         21 | Swings |
> |    3354 | Rinconada Park          |         21 | Swings |
> |    3693 | Werry Park              |         21 | Swings |
> |   26152 | Willow Oaks Park        |         21 | Swings |
> |   23559 | Live Oak Manor Park     |         21 | Swings |
> |   24295 | Oak Meadow Park         |         21 | Swings |
> |   25562 | Sunnyvale Baylands Park |         21 | Swings |
> |   21354 | Cornelis Bol Park       |         21 | Swings |
> |   21459 | Bubb Park               |         21 | Swings |
> |   21910 | Cooper Park             |         21 | Swings |
> |   24859 | Rengstorff Park         |         21 | Swings |
> |   24043 | Monroe Mini Park        |         21 | Swings |
> |   24867 | Rex Manor Playground    |         21 | Swings |
> +---------+-------------------------+------------+--------+
> 22 rows in set (0.00 sec)
>
>
> mysql> SELECT c.park_id, a.name, c.feature_id, b.name FROM parks a,  
> features b, features_parks c WHERE (c.park_id = a.id) AND  
> (c.feature_id = b.id) AND (c.feature_id = 51);
> +---------+--------------------------+------------+---------+
> | park_id | name                     | feature_id | name    |
> +---------+--------------------------+------------+---------+
> |    3120 | Mitchell Park            |         51 | Sandpit |
> |    2351 | Bowden Park              |         51 | Sandpit |
> |    2374 | Burgess Park             |         51 | Sandpit |
> |    2474 | Coyote Point County Park |         51 | Sandpit |
> |   24867 | Rex Manor Playground     |         51 | Sandpit |
> +---------+--------------------------+------------+---------+
> 5 rows in set (0.00 sec)
>
>
> So if I was trying to fetch all parks that have BOTH 'Swings' and a  
> 'Sandpit', I'd expect to see:
>
>    Mitchell Park
>    Bowden Park
>    Burgess Park
>    Rex Manor Playground
>
> If this was limited to only looking for 2 or 3 features at a time,  
> even on a hefty DB of 300,000 records, this wouldn't be too  
> difficult. You would fetch all parks for each of the 2 or 3  
> features, then eliminate all but the ones they share in common  
> (exclusionary merging of some sort). However, with 50 features, this  
> becomes unusable, since you'd have tons of redundancies when  
> retrieving the data, and you'd have to recursively iterate over 50  
> features.
>
> Any way to combine the queries into one, and subquery it to  
> eliminate ones that don't have all the features requested in the  
> query?
>
> -Kevin
>
> On Jul 7, 2009, at 5:44 PM, Kevin Elliott wrote:
>
>> I have two models, associated with a HABTM (actually using
>> has_many :through on both ends, along with a join table). I need to
>> retrieve all ModelA's that is associated with BOTH of two ModelB's. I
>> do NOT want all ModelA's for ModelB_1 concatenated with all ModelA's
>> for ModelB_2. I literally want all ModelA's that are associated with
>> BOTH ModelB_1 and ModelB_2. It is not limited to only 2 ModelB's, it
>> may be up to 50 ModelB's, so this must scale.
>>
>> I can describe the problem using a variety of analogies, that I think
>> better describes my problem than the previous paragraph:
>>
>>      * Find all books that were written by all 3 authors together.
>>      * Find all movies that had the following 4 actors in them.
>>      * Find all blog posts that belonged to BOTH the Rails and Ruby
>> categories for each post.
>>      * Find all users that had all 5 of the following tags: funny,
>> thirsty, smart, thoughtful, and quick.   (silly example!)
>>      * Find all people that have worked in both San Francisco AND San  
>> Jose
>> AND New York AND Paris in their lifetimes.
>>
>> I've thought of a variety of ways to accomplish this, but they're
>> grossly inefficient and very frowned upon.
>>
>> Taking an analogy above, say the last one, you could do something  
>> like
>> query for all the people in each city, then find items in each array
>> that exist across each array. That's a minimum of 5 queries, all the
>> data of those queries transfered back to the app, then the app has to
>> intensively compare all 5 arrays to each other (loops galore!).  
>> That's
>> nasty, right?
>>
>> Another possible solution would be to chain the finds on top of each
>> other, which would essentially do the same as above, but won't
>> eliminate the multiple queries and processing. Also, how would you
>> dynamicize the chain if you had user submitted checkboxes or values
>> that could be as high as 50 options? Seems dirty. You'd need a loop.
>> And again, that would intensify the search duration.
>>
>> Obviously, if possible, we'd like to have the database perform this
>> for us, so, people have suggested to me that I simply put multiple
>> conditions in. Unfortunately, you can only do an OR with HABTM
>> typically.
>>
>> Another solution I've run across is to use a search engine, like
>> sphinx or UltraSphinx. For my particular situation, I feel this is
>> overkill, and I'd rather avoid it. I still feel there should be a
>> solution that will let a user craft a query for an arbitrary number  
>> of
>> ModelB's and find all ModelA's.
>>
>>
>> How would *you* solve this problem?
>>
>>
>> Thanks a bunch,
>> Kevin
>>
>> (I've subsequently cross-posted this to StackOverflow's website since
>> they have a target audience that encompasses more than Rails, and is
>> still valid in those areas --- 
>> http://stackoverflow.com/questions/1095571/habtm-finds-with-and-joins-not-or
>>  )
>
> >


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby 
on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk@googlegroups.com
To unsubscribe from this group, send email to 
rubyonrails-talk+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to