Re: Big SELECT: ordering results by where matches are found

2007-09-11 Thread Chris Sansom

At 13:34 -0400 10/9/07, Baron Schwartz wrote:
Looks like you've found the solution you need.  The only other 
suggestion I have is to use UNION ALL if you don't need to eliminate 
duplicate rows in the UNION, because there's some overhead for 
checking for them.


Hi Baron

Thanks for this, and I did try it, but the difference in time taken 
to execute the query was negligible (I tested it multiple times) - it 
was around 0.02 seconds whichever way I did it, and when I used 
EXPLAIN, the results were identical except for one detail:


The number of rows in the first row of the EXPLAIN result was lower 
with plain UNION than if I used UNION ALL. As far as I can tell from 
my relatively limited experience with all this, the first row refers 
to my outer 'wrapper' select from the derived table (the table in the 
first row is given as 'derived2' and the Extra column shows 'Using 
temporary'). For a given query, with UNION ALL that has 45 rows, with 
UNION it's 31. So I guess I'll stick to plain UNION.


As far as my desire to cope with multiple search terms is concerned, 
I realise now that fulltext handles that anyway! So I've changed the 
few non-numeric fields that weren't indexed that way (fore, sur and 
topic) to fulltext and bingo! Not only that, but it all happens fully 
FOUR TIMES as quickly!


So many thanks, Baron - mainly due to you, yesterday was a very good 
MySQL day for me. It's not often I get two 'lightbulb moments' on the 
same day!


--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

Good people will do good things, and bad people will do bad things.
But for good people to do bad things - that takes religion.
   -- Steven Weinberg, physicist and Nobel Laureate

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Big SELECT: ordering results by where matches are found

2007-09-10 Thread Chris Sansom
I'm sure there must be an accepted technique for this, but it's 
something I haven't tried before, so if anyone can point me in the 
right direction I'd be grateful.


I'm writing a search facility for a site where the data is stored in 
several tables - let's say 5 for this example - and I want to order 
my results according to where (if anywhere) matches are found. So...


Let's say I have tables 'speakers', 'topics', 'speakers_topics', 
'articles', 'other'.

'speakers' is a table of speakers, with id, name and some text fields.
'topics' is a list of topics they address
'speakers_topics' relates the above two by pairs of id numbers
'articles' and 'other' are further tables of text data with possibly 
more than one row for some speakers, identified by id.


I want to search the data in the following order:
name from 'speakers'
topics
text data from 'speakers'
text data from 'articles' and 'other'
...and order the results according to where in that hierarchy a match is found.

So, if the user's search term matches one speaker's name field, 
another's topic and someone else's text data, that's the order in 
which the results should be ordered. Also, if the same person is 
matched from, say, both name and text fields (which is very likely, 
as their name will almost certainly appear in some of the text), the 
name should take precedence in the ordering.


To complicate matters further, I'd like if possible to extend this to 
an and/or situation. If the user enters two or more words, any 
results that match all the words should be ordered above those that 
match only some of the words.


I can probably do this relatively easily with a series of separate 
queries (I'm doing all this from PHP, by the way), but that strikes 
me as inefficient. Can it all be done in one big query, perhaps with 
subqueries?


--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

Revolution: an abrupt change in the form of misgovernment.
   -- Ambrose Bierce

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Big SELECT: ordering results by where matches are found

2007-09-10 Thread Baron Schwartz

Chris Sansom wrote:
I'm sure there must be an accepted technique for this, but it's 
something I haven't tried before, so if anyone can point me in the right 
direction I'd be grateful.


I'm writing a search facility for a site where the data is stored in 
several tables - let's say 5 for this example - and I want to order my 
results according to where (if anywhere) matches are found. So...


Let's say I have tables 'speakers', 'topics', 'speakers_topics', 
'articles', 'other'.

'speakers' is a table of speakers, with id, name and some text fields.
'topics' is a list of topics they address
'speakers_topics' relates the above two by pairs of id numbers
'articles' and 'other' are further tables of text data with possibly 
more than one row for some speakers, identified by id.


I want to search the data in the following order:
name from 'speakers'
topics
text data from 'speakers'
text data from 'articles' and 'other'
...and order the results according to where in that hierarchy a match is 
found.


So, if the user's search term matches one speaker's name field, 
another's topic and someone else's text data, that's the order in which 
the results should be ordered. Also, if the same person is matched from, 
say, both name and text fields (which is very likely, as their name will 
almost certainly appear in some of the text), the name should take 
precedence in the ordering.


To complicate matters further, I'd like if possible to extend this to an 
and/or situation. If the user enters two or more words, any results that 
match all the words should be ordered above those that match only some 
of the words.


I can probably do this relatively easily with a series of separate 
queries (I'm doing all this from PHP, by the way), but that strikes me 
as inefficient. Can it all be done in one big query, perhaps with 
subqueries?


I've built similar systems with a series of UNION queries.  Each UNION 
has a column for relevance, which can be a sum of CASE statements, 
such as


IF(name matches, 1, 0) + IF(text matches, 1, 0) AS relevance...

The entire UNION can then be ordered by relevance.  You could also just 
add in an arbitrary number in each UNION, to get the effect of ordering 
by where in the hierarchy the match is found.


Baron

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Big SELECT: ordering results by where matches are found

2007-09-10 Thread Chris Sansom
At 11:01 -0400 10/9/07, Baron Schwartz wrote:
I've built similar systems with a series of UNION queries.  Each UNION has a 
column for relevance, which can be a sum of CASE statements, such as

IF(name matches, 1, 0) + IF(text matches, 1, 0) AS relevance...

The entire UNION can then be ordered by relevance.  You could also just add in 
an arbitrary number in each UNION, to get the effect of ordering by where in 
the hierarchy the match is found.

Oo-er. This sounds marvellous, and I /think/ I see what you're getting at, but 
it's a bit beyond anything I've done before - never used UNION for instance. 
Can you perhaps go into a little more detail?

-- 
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

The nice thing about standards is that there are so
many of them to choose from.
   -- Andrew S. Tanenbaum

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Big SELECT: ordering results by where matches are found

2007-09-10 Thread Chris Sansom

At 11:01 -0400 10/9/07, Baron Schwartz wrote:
The entire UNION can then be ordered by relevance.  You could also 
just add in an arbitrary number in each UNION, to get the effect of 
ordering by where in the hierarchy the match is found.


Actually, your pointing me towards UNION may have done the trick. I 
read up on it on the MySQL docs site and I've ended up with this, 
which actually covers more tables and fields than in my original post:


---

select distinct tb.speaker_id, tb.fore, tb.sur, tb.division from
(
(
select 1 as relevance, speaker_id, fore, sur, division
from speakers
where fore like '%education%' or sur like '%education%')
union
(
select 2 as relevance, s.speaker_id, fore, sur, division
from speakers s, speakers_topics st, topics t
where st.speaker_id = s.speaker_id and t.topic_id = st.topic_id and 
topic like '%education%'

)
union
(
select 3 as relevance, speaker_id, fore, sur, division
from speakers where match (strap, shortbio, longbio) against ('education')
)
union
(
select 4 as relevance, s.speaker_id, fore, sur, division
from speakers s, articles a
where s.speaker_id = a.speaker_id and match (title, article) against 
('education')

)
union
(
select 5 as relevance, s.speaker_id, fore, sur, division
from speakers s, other o
where s.speaker_id = o.speaker_id and match (title, article) against 
('education')

)
union
(
select 6 as relevance, speaker_id, fore, sur, division
from speakers, books
where speaker_id = author and match (title, description) against ('education')
)
order by relevance, division, sur, fore
) as tb

---

First, I did it without the outer select, and I got speakers repeated 
if they were matched in more than one block. One of the comments on 
the MySQL docs site suggested the 'wrapper', which I did initially 
like this:


select distinct speaker_id, fore, sur, division from... with nothing 
after the final ')'. This gave me an error to the effect that derived 
tables must always have an alias. What the hey, let's just try it 
like this (the above)... and to my astonishment it worked!


So before I sign off on this thread, can you see any way I could improve this?

Naturally, I haven't yet incorporated the treatment of more than one 
search term, but I'll try and work that out for myself. :-)


--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

Justice is incidental to law and order.
   -- J. Edgar Hoover

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Big SELECT: ordering results by where matches are found

2007-09-10 Thread Baron Schwartz



Chris Sansom wrote:

At 11:01 -0400 10/9/07, Baron Schwartz wrote:
The entire UNION can then be ordered by relevance.  You could also 
just add in an arbitrary number in each UNION, to get the effect of 
ordering by where in the hierarchy the match is found.


Actually, your pointing me towards UNION may have done the trick. I read 
up on it on the MySQL docs site and I've ended up with this, which 
actually covers more tables and fields than in my original post:


---

select distinct tb.speaker_id, tb.fore, tb.sur, tb.division from
(
(
select 1 as relevance, speaker_id, fore, sur, division
from speakers
where fore like '%education%' or sur like '%education%')
union
(
select 2 as relevance, s.speaker_id, fore, sur, division
from speakers s, speakers_topics st, topics t
where st.speaker_id = s.speaker_id and t.topic_id = st.topic_id and 
topic like '%education%'

)
union
(
select 3 as relevance, speaker_id, fore, sur, division
from speakers where match (strap, shortbio, longbio) against ('education')
)
union
(
select 4 as relevance, s.speaker_id, fore, sur, division
from speakers s, articles a
where s.speaker_id = a.speaker_id and match (title, article) against 
('education')

)
union
(
select 5 as relevance, s.speaker_id, fore, sur, division
from speakers s, other o
where s.speaker_id = o.speaker_id and match (title, article) against 
('education')

)
union
(
select 6 as relevance, speaker_id, fore, sur, division
from speakers, books
where speaker_id = author and match (title, description) against 
('education')

)
order by relevance, division, sur, fore
) as tb

---

First, I did it without the outer select, and I got speakers repeated if 
they were matched in more than one block. One of the comments on the 
MySQL docs site suggested the 'wrapper', which I did initially like this:


select distinct speaker_id, fore, sur, division from... with nothing 
after the final ')'. This gave me an error to the effect that derived 
tables must always have an alias. What the hey, let's just try it like 
this (the above)... and to my astonishment it worked!


So before I sign off on this thread, can you see any way I could improve 
this?


Naturally, I haven't yet incorporated the treatment of more than one 
search term, but I'll try and work that out for myself. :-)


Looks like you've found the solution you need.  The only other 
suggestion I have is to use UNION ALL if you don't need to eliminate 
duplicate rows in the UNION, because there's some overhead for checking 
for them.


Baron

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]