> -----Original Message-----
> From: Daevid Vincent [mailto:[EMAIL PROTECTED] 
> Sent: Friday, May 04, 2007 1:22 AM
> To: mysql@lists.mysql.com
> Subject: How do I find products when a user types freeform 
> strings like 'Sony 20" TV' or '20" Sony TV'?
> 
> I'm having trouble figuring out the logic/query I want.
> I know that all those "OR"s are not right.
> I'm doing this in PHP and mySQL (of course), 
> so if it can't be done with a single query, I can split it up.
> 
> Here's the challenge, given a text field search box, someone enters:
> 
>       Sony 20" TV
> 
> How do I search for that, not knowing which fields are which?
> For example, they could have also entered:
> 
>       20" Sony TV
> 
> This is the one I have now, but (as you probably noticed), it 
> will return many rows,
> I expect that most of the time > 1 row will be returned, but 
> I'm getting a grip more than I want (or the customer would want), and
> also rows that have nothing to do with the search terms.
> 
> SELECT products.*, companies.name AS company_name, 
> categories.name AS category_name 
> FROM   products 
>        LEFT JOIN companies ON company_id = companies.id 
>        LEFT JOIN categories ON category_id = categories.id 
> WHERE  products.enabled = 1 
>       AND( 
>           (products.model LIKE 'sony%'   OR products.model 
> LIKE '20%'  OR products.model LIKE 'tv%') 
>        OR (products.upc LIKE 'sony'      OR products.upc LIKE 
> '20'     OR products.upc LIKE 'tv') 
>        OR (products.name LIKE '%sony%'   OR products.name 
> LIKE '20%'   OR products.name LIKE '%tv%') 
>        OR (companies.name LIKE 'sony%'   OR companies.name 
> LIKE '20%'  OR companies.name LIKE 'tv%') 
>        OR (categories.name LIKE '%sony%' OR categories.name 
> LIKE '20%' OR categories.name LIKE '%tv%') 
>       ) 
> ORDER BY categories.name DESC, products.name ASC, companies.name ASC;
> 
> (and that just gets uglier the more words in the search)
> 
> +----+------------------+--------------+--------------+-------
> --------+
> | id | name             | model        | company_name | 
> category_name |
> +----+------------------+--------------+--------------+-------
> --------+
> |  1 | 20" TV           | STV20-KVR-HD | Sony         | Tube  
>         | <---
> |  2 | 36" TV           | STV36-KVR-HD | Sony         | Tube  
>         | 
> |  4 | Ultra-Plasma 62" | UP62F900     | Sony         | 
> Plasma        | 
> |  5 | Value Plasma 38" | VPR542_38    | Sony         | 
> Plasma        | 
> |  6 | Power-MP3 5gb    | 09834wuw34   | Sony         | MP3 
> Players   | 
> |  3 | Super-LCD 42"    | SLCD42hd002  | Sony         | LCD   
>         | 
> |  7 | Super-Player 1gb | SP1gb        | Sony         | Flash 
>         | 
> |  8 | Porta CD         | pcd500       | Sony         | CD 
> Players    | 
> ......
> +----+------------------+--------------+--------------+-------
> --------+
> 
> Obviously the person wanted id = 1 in this case.
> 
> Unrelated, is there any speed improvement using JOIN instead 
> of LEFT JOIN ?
> Think millions of products.
> 
> Thanks for help and suggestions...
> 
> Daevid.

I'll attach a .php file, but this list server may strip it off, so I'll also 
paste it below, sorry for any formatting issues in
advance... 

<?php
if ($_POST['keywords'])
{
        $_POST['keywords'] = stripslashes($_POST['keywords']);
        $words = preg_split("/\s+/",$_POST['keywords'], -1, 
PREG_SPLIT_NO_EMPTY);
}

$sql = 'SELECT  products.* FROM product_table WHERE 1 ';
$sql .= keyword_filter($words, array('products.model%', 'products.upc', 
'%products.name%', 'companies.name%', '%categories.name%'),
true);
$sth = SQL_QUERY($sql);

/**
* Builds the WHERE portion of a SQL statement using the keywords in various 
columns with wildcard support.
* 
* @return       string SQL statement fragment
* @param        mixed $words either a string of words space deliminated or an 
array of words
* @param        array $columns an array of table.column names to search the 
$words in. Use % as a wildcard for example pass in
'username%' or '%username%'.
* @param        boolean $and (true) whether the words have to be ANDed or ORed 
together.
* @author       Daevid Vincent [EMAIL PROTECTED]
* @since        1.0
* @version      1.4
* @date         05/10/07
* @todo         This should handle +, - and "" just like google or yahoo or 
other search engines do.
*/
function keyword_filter($words, $columns, $and = true)
{
        // this maybe useful
        // 
http://wiki.ittoolbox.com/index.php/Code:Translate_Boolean_Query_to_SQL_select_statement
        // http://www.ibiblio.org/adriane/queries/
        // 
http://www.zend.com/zend/tut/tutorial-ferrara1.php?article=tutorial-ferrara1&kind=t&id=8238&open=1&anc=0&view=1
        
        // 
http://evolt.org/article/Boolean_Fulltext_Searching_with_PHP_and_MySQL/18/15665/index.html
        // http://www.databasejournal.com/features/mysql/article.php/3512461
        
        // this would be great, but the dumb-asses don't work with InnoDB 
tables. GRRR!
        // http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html
        //$sql .= " AND MATCH (".implode(',',$columns).") AGAINST ('".implode(' 
',$words)."' IN BOOLEAN MODE)";
        
        if (!is_array($columns) or !$words) return;
        
        if (is_string($words))
                $words = preg_split("/\s+/",$words, -1, PREG_SPLIT_NO_EMPTY);
        
        if(count($words) < 1) return '';
        
        if ($and) //AND the words together
        {
                $sql = " AND ";
                $sqlArray = array();
                foreach($words as $word)
                {
                        $tmp = array();
                        foreach($columns as $field)
                        {
                                $col = str_replace('%','',$field);
                                //[dv] read the http://php.net/preg_replace 
carefully. You must use this format, 
                                //         because otherwise $words that are 
digits will cause undesired results.
                                $myword = preg_replace("/(%)?([\w\.]+)(%)?/", 
"\${1}".$word."\${3}", $field );
                                $tmp[] = $col." LIKE '".SQL_ESCAPE($myword)."'";
                        }
                        $sqlArray[] = " (".implode(" OR ",$tmp).") ";
                }
                $sql .= implode(" AND ", $sqlArray);
        }
        else //OR the words together
        {
                $sql = " AND ( ";
                $sqlArray = array();
                foreach($columns as $field)
                {
                        $col = str_replace('%','',$field);
                        
                        $tmp = array();
                        foreach($words as $word)
                        {
                                //[dv] read the http://php.net/preg_replace 
carefully. You must use this format, 
                                //         because otherwise $words that are 
digits will cause undesired results.
                                $myword = preg_replace("/(%)?([\w\.]+)(%)?/", 
"\${1}".$word."\${3}", $field );
                                $tmp[] = $col." LIKE '".SQL_ESCAPE($myword)."'";
                        }
                        $sqlArray[] = "(".implode(" OR ",$tmp).") ";
                }
                $sql .= implode(" OR ", $sqlArray);
                $sql .= ") ";
        }
        
        return $sql;
}
?>

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to