Hi

 

I'm running a small search engine that indexes the web, but have been having
trouble with optimising it to handle the load.

There are two tables involved in searching - `pages`, which stores the ID,
URL, title and crawl date, and `words` which has two rows - `word` and
`page` where `word` is a word found on the page and `page` is the ID from
the `pages` table.

 

When we crawl an URL we rip all the words from the page and add them to the
`words` table with the ID of the page.

 

The query we use for searches is:

 

SELECT COUNT(words.word) AS score, words.page AS id, pages.title, pages.url
FROM words,pages WHERE pages.id=words.page AND words.word IN($words) GROUP
BY words.page ORDER BY score DESC LIMIT 10

 

I've put the LIMIT 10 in there because it's been going slow as hell. not
only that but it's still going rather slow since we're getting rather high
load on the search engine at the moment.

 

If anyone could suggest ways to make it run faster that'd be great, bearing
in mind that:

 

a)      I can't change MySQL server parameters since the host won't allow it

b)      I'd rather not start crawling again with a different method - the
words table has over 1,700,000 rows.

 

Thanks

> Jasper Bryant-Greene
> Cabbage Promotions
>  <mailto:[EMAIL PROTECTED]> [EMAIL PROTECTED]
>  <http://fatalnetwork.com/> http://fatalnetwork.com/
> US: +1 (509) 691 3287
> NZ: +64 (21) 232 3303

Reply via email to