Problems with boolean keyword search

2004-06-17 Thread Andreas Ahlenstorf
Hello,

I have to build a keyword search with support for the boolean
operators +, -, * and .

Basically it's quite easy: Connect the three tables with two
joins, keyword search with LIKE. But the boolean operators give
me a headache.

mysql SELECT c.id FROM keywords AS a LEFT JOIN keywords_transfer AS b
- ON a.id = b.keyword LEFT JOIN images AS c ON b.series = c.series
- AND b.image = c.image WHERE a.word LIKE 'costa brava' AND a.word LIKE
- 'Ziegenkäse' AND a.word NOT LIKE 'Dörfer';
Empty set (0.00 sec)

That was my first idea. It should return one row. It's clear that
it does not, because a.word contains only one keyword.

Second idea was to fetch all the images which are (not) matching
the query and calculate the intersection.

SELECT id, image, series, title, variant_t FROM images WHERE 1 OR
(image = '001' and series = '07001') OR (image = '002' and series
= '07001') OR (image = '003' and series = '07001') OR (image =
'004' and series = '07001') OR (image = '006' and series =
'07001') OR (image = '007' and series = '07001') OR (image =
'008' and series = '07001') OR (image = '009' and series =
'07001') OR (image = '010' and series = '07001') OR (image =
'011' and series = '07001') OR (image = '012' and series =
'07001') OR (image = '013' and series = '07001') OR (image =
'014' and series = '07001') OR (image = '015' and series =
'07001') OR (image = '016' and series = '07001') OR (image =
'017' and series = '07001') OR (image = '018' and series =
'07001') OR (image = '019' and series = '07001') OR (image =
'020' and series = '07001') OR (image = '021' and series = '07001')

If the search is only matching a few fields, it works quiet well,
but with a bigger result set it's very slow. Quering the table
using WHERE image IN (...) and series IN (...) is not possible
because it would fetch to much images if the keywords are
matching two different values for series.

Has anyone a hint for me to get that working at a good speed?

Regards,
A.

P.S: I have to use the fields series and image insted of the
 primary keys to link the tables. If I don't do that, I have
 problems with the data integriy because of the import
 scripts.
--
mysql SELECT * FROM keywords LIMIT 5;
++-++
| id | word| amount |
++-++
|  1 | Firence | 49 |
|  2 | Italien | 49 |
|  3 | Toscana | 49 |
|  4 | Florenz | 59 |
|  5 | ESF | 29 |
++-++
5 rows in set (0.00 sec)

mysql SELECT * FROM keywords_transfer LIMIT 5;
+++---+-+
| id | series | image | keyword |
+++---+-+
|  1 |  00026 |   001 |   1 |
|  2 |  00026 |   001 |   2 |
|  3 |  00026 |   001 |   3 |
|  4 |  00026 |   001 |   4 |
|  5 |  00026 |   001 |   5 |
+++---+-+
5 rows in set (0.00 sec)

mysql SELECT id, series, image FROM images LIMIT 5;
+++---+
| id | series | image |
+++---+
|  1 |  00026 |   001 |
|  2 |  00026 |   002 |
|  3 |  00026 |   003 |
|  4 |  00026 |   004 |
|  5 |  00026 |   005 |
+++---+
5 rows in set (0.00 sec)



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Problems with boolean keyword search

2004-06-17 Thread Eamon Daly
FYI, MySQL allows boolean searching as of 4.0.1. See:

http://dev.mysql.com/doc/mysql/en/Fulltext_Boolean.html

Or pp. 256-257 of MySQL, 2nd Ed. by Paul DuBois for several
examples.


Eamon Daly



- Original Message - 
From: Andreas Ahlenstorf [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, June 17, 2004 11:40 AM
Subject: Problems with boolean keyword search


 Hello,

 I have to build a keyword search with support for the boolean
 operators +, -, * and .

 Basically it's quite easy: Connect the three tables with two
 joins, keyword search with LIKE. But the boolean operators give
 me a headache.

 mysql SELECT c.id FROM keywords AS a LEFT JOIN keywords_transfer AS b
 - ON a.id = b.keyword LEFT JOIN images AS c ON b.series = c.series
 - AND b.image = c.image WHERE a.word LIKE 'costa brava' AND a.word
LIKE
 - 'Ziegenkäse' AND a.word NOT LIKE 'Dörfer';
 Empty set (0.00 sec)

 That was my first idea. It should return one row. It's clear that
 it does not, because a.word contains only one keyword.

 Second idea was to fetch all the images which are (not) matching
 the query and calculate the intersection.

 SELECT id, image, series, title, variant_t FROM images WHERE 1 OR
 (image = '001' and series = '07001') OR (image = '002' and series
 = '07001') OR (image = '003' and series = '07001') OR (image =
 '004' and series = '07001') OR (image = '006' and series =
 '07001') OR (image = '007' and series = '07001') OR (image =
 '008' and series = '07001') OR (image = '009' and series =
 '07001') OR (image = '010' and series = '07001') OR (image =
 '011' and series = '07001') OR (image = '012' and series =
 '07001') OR (image = '013' and series = '07001') OR (image =
 '014' and series = '07001') OR (image = '015' and series =
 '07001') OR (image = '016' and series = '07001') OR (image =
 '017' and series = '07001') OR (image = '018' and series =
 '07001') OR (image = '019' and series = '07001') OR (image =
 '020' and series = '07001') OR (image = '021' and series = '07001')

 If the search is only matching a few fields, it works quiet well,
 but with a bigger result set it's very slow. Quering the table
 using WHERE image IN (...) and series IN (...) is not possible
 because it would fetch to much images if the keywords are
 matching two different values for series.

 Has anyone a hint for me to get that working at a good speed?

 Regards,
 A.

 P.S: I have to use the fields series and image insted of the
  primary keys to link the tables. If I don't do that, I have
  problems with the data integriy because of the import
  scripts.
 --
 mysql SELECT * FROM keywords LIMIT 5;
 ++-++
 | id | word| amount |
 ++-++
 |  1 | Firence | 49 |
 |  2 | Italien | 49 |
 |  3 | Toscana | 49 |
 |  4 | Florenz | 59 |
 |  5 | ESF | 29 |
 ++-++
 5 rows in set (0.00 sec)

 mysql SELECT * FROM keywords_transfer LIMIT 5;
 +++---+-+
 | id | series | image | keyword |
 +++---+-+
 |  1 |  00026 |   001 |   1 |
 |  2 |  00026 |   001 |   2 |
 |  3 |  00026 |   001 |   3 |
 |  4 |  00026 |   001 |   4 |
 |  5 |  00026 |   001 |   5 |
 +++---+-+
 5 rows in set (0.00 sec)

 mysql SELECT id, series, image FROM images LIMIT 5;
 +++---+
 | id | series | image |
 +++---+
 |  1 |  00026 |   001 |
 |  2 |  00026 |   002 |
 |  3 |  00026 |   003 |
 |  4 |  00026 |   004 |
 |  5 |  00026 |   005 |
 +++---+
 5 rows in set (0.00 sec)



 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Problems with boolean keyword search

2004-06-17 Thread Andreas Ahlenstorf
Hello,

Eamon Daly schrieb am Donnerstag, 17. Juni 2004 um 19:04:

 FYI, MySQL allows boolean searching as of 4.0.1. See:
 http://dev.mysql.com/doc/mysql/en/Fulltext_Boolean.html

I normally use that. But: Very bad result quality because there
are only sigle keywords. I tried it with a text column containing
all the keywords. Unfortunately not better.

Regards,
   A.











-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]