On 10/21/2010 9:57 AM, Ali A.F.N wrote:
Hi All,

I have a table with different fileds and almost the type of all them are
smallint. I want to search on some fields with "OR" condition. I want to know is
there possibility to know how many fileds matched exactly? then I can say how
many percent match are available.

select * from my_table where sex_id = 1 or country_id = 120 or education_id

I mean if in my table there are some records with sex_id = 1 or country_id =
120  then I got 2 (2 match) then I can say 66% percent match.

Thank you,

This is where having a name-value pair in your MySQL can help. Your data appears to belong to a dating site but it could easily belong to a product catalog or many other types of data sets. In this example,I want to compute product matches to see how close they are to my search criteria.

In rough symbolic terms, this is one layout that can help.

item table
=================
id
product_name
(other fields)

item qualities table
=================
quality_id
item_id
quality_name
quality_value

An index on (quality_name, quality_value, item_id) also comes in very handy right about now.

Let's say you wanted to look up all of the items that have color=blue, size=90cm, and flavor=peach (I really cannot imagine what this product may be, but after all this is only an example)

So, with a big wide table, you would need to either do something like

SELECT ... FROM old_style WHERE color='blue' and size='90cm' and flavor='peach';

But that would only find you an exact match. For partial matches, you would need to construct all sorts of partial queries. like

... WHERE color='blue'
... WHERE color='blue' and size='90cm'
... WHERE color=size='90cm'
... WHERE color=size='90cm' and flavor='peach'
...

and compare the results.

Using the new tables, you construct 3 union queries in your code and cache the results in a temporary table:

CREATE TEMPORARY TABLE tmp_relevance ENGINE=MEMORY
(SELECT item_id FROM item_qualities WHERE color='blue')
UNION ALL
(SELECT item_id FROM item_qualities WHERE size='90cm')
UNION ALL
(SELECT item_id FROM item_qualities WHERE flavor='peach');

Then you count up how often each item_id was matched:

SELECT item_id, count(*) as frequency FROM tmp_relevance GROUP BY item_id;

Could even modify that last step to check for a threshold of matching (say only those that match at least half of the terms you are looking for) with something that looks like this

SELECT item_id, count(*) as frequency FROM tmp_relevance GROUP BY item_id HAVING frequency/(#of terms in the search) >=0.5 ;

Of course, you know the value of (# of terms in the search) because that's how many union queries you needed to run.

You can improve on this technique in many ways. Here is one from the top of my head:

Instead of returning only an item_id in the first query, you can also return a quality rating. Let's say you were looking for something sized 90cm and you only have 88cm pieces in stock, that may return a match quality code of

1-(abs(90-88)/90)

You can combine that in the query against tmp_relevance to generate scores for near matches and not just exact partial matches.

Does this give you some ideas to build on?
--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc.
Office: Blountville, TN

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to