Joe Rhett wrote:


mysql> select Notice_ID from Notices where match (Text) against ('+pollution +control' in boolean mode);
Empty set (0.00 sec)


mysql> select Notice_ID from Notices where match (Text) against ('"pollution control"' in boolean mode);
Empty set (0.02 sec)


mysql> select Notice_ID from Notices where Text like '%pollution control%';
+-----------+
| Notice_ID |
+-----------+
| 192090 |
+-----------+
1 row in set (5.00 sec)

Your LIKE query is not equivalent to your MATCH AGAINST. For example, Text containing "pollution controls" would match LIKE '%pollution control%', but would not MATCH AGAINST ('+pollution +control'...). Have you looked at Text for this row to be sure it contains exactly "pollution" and "control"?

No, it isn't equivalent but it would match less than the AND would,
since it is an exact phrase. Yes, those words and that exact phrase exists multiple times. It has been confirmed as a bug.


mysql> select Notice_ID from Notices where match (Text) against ('+pollution +air' in boolean mode);
Empty set (0.03 sec)


mysql> select Notice_ID from Notices where match (Text) against ('"air pollution"' in boolean mode);
Empty set (0.00 sec)


mysql> select Notice_ID from Notices where Text like '%air pollution%';
+-----------+
| Notice_ID |
+-----------+
|    196349 |
|    196569 |
|    188183 |
|    192090 |
|    192686 |
|    199283 |
+-----------+
6 rows in set (0.17 sec)

(NOTE on the search for air -- my.cnf has ft_min_word_len=3)

Are you certain that air is indexed? What does


SELECT COUNT(*) FROM Notices WHERE MATCH (Text) AGAINST ('air');

return?


Just over 6 thousand results. And it was answered just below, here:


All OR searches work perfectly fine, as per:

mysql> select count(*) from Notices where match (Text) against ('air pollution');
+----------+
| count(*) |
+----------+
| 100 |
+----------+
1 row in set (0.03 sec)

How many do you get with


SELECT COUNT(*) FROM Notices WHERE MATCH (Text) AGAINST ('pollution');

If air isn't indexed, I'd expect 0 hits for 'air' and 100 hits for 'pollution'.

17 hits for pollution.


Do you normally talk down to people as if they are too retarted to have
tested the obvious themselves?  This wasn't a complaint or a how-to
question, but a researched and tested bug.  I specifically demonstrated
that the phrases exist and that clearly stated that the OR searches worked.

I'm sorry. It was not my intention to talk down to you or otherwise give offense, but given your obvious frustration with this problem, I chose my words poorly. I apologize.


I wrote my reply before Sergei recognized the problem and told you how to fix it (and even he asked for more info). While you may have done days of research, we on the list only had the evidence you gave to go on. It is important when troubleshooting not to make assumptions. As you only provided query results, not actual data, I was hoping to elicit some additional info to help with the diagnosis.

For example, you gave a query using LIKE to demonstrate that there should be at least one match for your first MATCH AGAINST, but your LIKE test is not a subset of your MATCH AGAINST. You are correct that LIKE will only return the subset of returns from MATCH AGAINST that are exact phrases, but it will also return things which MATCH AGAINST will not. I gave one examaple, namely a phrase containing "pollution controls". The 's' at the end would satisfy the % of the LIKE test but 'controls' is not 'control' to MATCH AGAINST. You may know that there are many rows which should have matched, but you didn't show us one.

In the case of 'air pollution', I must admit I failed to say what I was getting at. There have been questions before where mysql was not using ft_min_word_len=3 (something as trivial as putting it in the wrong section in my.cnf, for example) as intended. I thought it important to rule that out. Knowing that OR searches work does not rule that out. Try it:

  SELECT COUNT(*) FROM Notices
  WHERE MATCH (Text) AGAINST ('no pollution');

should return the same 17 rows as the simple MATCH AGAINST ('pollution') did. The word 'no' is too short, isn't indexed, and is thus ignored in your query. If ft_min_word_len had not been successfully set to 3, 'air' would have been ignored. In this message you provided the fact that a MATCH AGAINST 'air' returns over 6 thousand hits, which clearly shows that ft_min_word_len is set to 3, as you intended. Again, while you had probably already verified that air was indexed, you didn't show us.

I did not take your message to be "a complaint or a how-to question", but a real problem to be solved. In rereading my message, I can see how someone, especially someone already frustrated with the problem, could take my questions as a suggestion that you didn't know what you were doing. I didn't mean to do that, and I'm sorry I wrote it that way. But, to be fair, you must admit that for all your research, this was not "confirmed as a bug". Sergei's correct (so far as I know) solution was that you missed a step in the upgrade from 3.23.x to 4.0.x.

Michael




-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to