I have the following tables. All threads add jobs to the Que table, and when a thread processes a URL it is added to the fetched table. I want to select the first record from Que that contains the given host and is not in the fetched table. I use the following query to get the result, but as the tables get bigger it sometimes takes 50-60 seconds to complete. Is there a way to reduce this time?

select id, url, anchor, pid from spider.Que where id =
    (select min(id) id from spider.Que
     where url like 'host %' and id >= 5
     and url not in (select url from spider.fetched)
    );
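
One direction that might be worth trying is to express the "not in fetched" condition as an anti-join and let ORDER BY ... LIMIT 1 pick the lowest id, instead of nesting two subqueries. The following is only a sketch under the assumption that it is meant to return the same row as the query above. The aliases q and f are illustrative, and the LIKE pattern and the id bound are copied unchanged. Whether it is actually faster depends on the MySQL version and the available indexes, so it should be checked with EXPLAIN.

```sql
-- Anti-join sketch: keep only Que rows that have no matching row in fetched.
-- fetched.url is declared NOT NULL, so "f.url IS NULL" can only be true
-- when the LEFT JOIN found no match for q.url.
SELECT q.id, q.url, q.anchor, q.pid
FROM spider.Que AS q
LEFT JOIN spider.fetched AS f ON f.url = q.url
WHERE q.url LIKE 'host %'
  AND q.id >= 5
  AND f.url IS NULL
ORDER BY q.id   -- lowest id first, which replaces min(id)
LIMIT 1;
```

Prefixing the statement with EXPLAIN shows whether the join on fetched can use an index or falls back to a full table scan.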


CREATE TABLE `Que` (
  `id` int(11) NOT NULL auto_increment,
  `url` text NOT NULL,
  `anchor` text,
  `pid` int(11) NOT NULL,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `newindex` (`url`(250))
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='index of urls for future sessions';


CREATE TABLE `fetched` (
  `id` int(11) default NULL,
  `url` text NOT NULL,
  `anchor` text,
  `content` text,
  `pid` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
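
As defined above, fetched has no index at all, so any lookup on fetched.url has to scan the whole table. A possible fix, sketched here as an assumption rather than a tested change, is a prefix index on url with the same 250-character prefix that Que already uses. The index name idx_fetched_url is made up for illustration.

```sql
-- Hypothetical index name; the 250-character prefix mirrors the existing
-- unique key on Que.url. A prefix length is required because url is TEXT.
ALTER TABLE spider.fetched
  ADD INDEX idx_fetched_url (url(250));
```

Building the index on a large MyISAM table locks it for the duration, so this is probably best done while the spider threads are stopped.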



Nurullah Akkaya
