The structure of the tables involved are as follows:
CREATE TABLE `listing_impression` (
`id` bigint(20) NOT NULL auto_increment,
`listing_id` bigint(20) NOT NULL,
`url` varchar(255) NOT NULL,
`timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP,
`listing_type` enum('full','free') NOT NULL default 'full',
PRIMARY KEY (`id`),
KEY `listing_id` (`listing_id`),
KEY `from` (`url`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=27527384 ;
CREATE TABLE `listing` (
`id` bigint(20) unsigned NOT NULL auto_increment,
`category_id` bigint(20) unsigned NOT NULL default '0',
`company_id` bigint(20) unsigned NOT NULL default '0',
`region_id` bigint(20) unsigned NOT NULL default '0',
`href` varchar(255) NOT NULL,
`title` varchar(50) NOT NULL,
`details` text NOT NULL,
`recommendations` varchar(255) NOT NULL default '',
`info` tinytext NOT NULL,
`address` varchar(100) NOT NULL default '',
`city` varchar(50) NOT NULL default '',
`phone` varchar(50) NOT NULL,
`secondary_phone` varchar(50) NOT NULL,
`cellphone` varchar(50) NOT NULL,
`fax` varchar(50) NOT NULL,
`website` varchar(50) NOT NULL default '',
`email` varchar(100) NOT NULL,
`contact_name` varchar(50) NOT NULL default '',
`sort` smallint(6) NOT NULL,
`map_display` tinyint(4) NOT NULL default '0',
`map_address` varchar(128) NOT NULL,
`map_geocode` varchar(128) NOT NULL,
`enable_vianet` tinyint(4) NOT NULL,
`location_id_vianet` int(11) NOT NULL,
`creation_date` timestamp NOT NULL default CURRENT_TIMESTAMP,
`last_updated` datetime NOT NULL,
`last_view_milestone` int(11) NOT NULL default '0',
`status` enum('active','disabled','pending','deleted') NOT NULL default
'active',
`type` enum('full','free') NOT NULL default 'full',
PRIMARY KEY (`id`),
KEY `company_id` (`company_id`),
KEY `region_id` (`region_id`),
KEY `category_id` (`category_id`),
KEY `category_teambuilding_id` (`category_teambuilding_id`),
FULLTEXT KEY `details` (`details`),
FULLTEXT KEY `all` (`title`,`details`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 AUTO_INCREMENT=2669 ;
The query that slows a web page load down by typically 30 seconds is:
SELECT COUNT(*) AS count ,
TIMESTAMP(CONCAT(YEAR(t.timestamp), "-", MONTH(t.timestamp),
"-", DAYOFMONTH(t.timestamp))) AS division
FROM listing l, listing_impression t
WHERE l.id=t.listing_id
AND l.company_id=710
AND t.listing_type = 'full'
AND t.timestamp > DATE_SUB(NOW(), INTERVAL 7 day)
GROUP BY division
ORDER BY division
However the query I have made to summarise the listing impressions for the day
(to do data warehousing) is as follows:
SELECT COUNT(listing_id) AS count_impressions, listing_id
FROM listing_impression
WHERE timestamp LIKE '2008-05-06%'
GROUP BY listing_id
The listing table only has about 1800 records, whereas listing_impression over
26,000,000.
Ideas for speeding it all up are welcome.
Greg
--- On Wed, 13/5/09, ctx2002 <[email protected]> wrote:
> From: ctx2002 <[email protected]>
> Subject: [phpug] Re: [OT] Large Database Tables
> To: "NZ PHP Users Group" <[email protected]>
> Date: Wednesday, 13 May, 2009, 10:52 AM
>
> really need to see your table structure and sql , so we
> know what is
> problem.
>
>
>
> On May 12, 10:51 pm, "Gregor Brabyn" <[email protected]>
> wrote:
> > Anyone got any good advice on dealing with large
> database tables.
> >
> > I have a page in the client area of a website that
> normally takes over 30
> > seconds to load almost purely due to a query on a
> large database table. The
> > table records impressions for listings, each row
> represents one listing
> > getting an impression. The table now has over
> 26,000,000 rows accumulated
> > over a period just exceeding 1000 days.
> >
> > I have a plan to do a bit of data warehousing so that
> the total number of
> > impressions for a listing are summarized into one row
> in a separate table.
> > Once it is set up a CRON will be able to run each
> night and summarize the
> > impressions into this new table. Queries should be
> able to read data quickly
> > off this table.
> >
> > The problem I am having is that I need to summarize
> all the past data which
> > spans over 1000 days and I am finding that the MySQL
> query to collect one
> > day's worth of summaries can easily take 5 minutes. I
> am thinking that if I
> > run a CRON to summarize a day's data into the
> warehousing table every 15
> > minutes then it is going to take something like 11
> days.
> >
> > At present I am also unsure of the settings on the
> Apache server, I don't
> > think I will be able to have scripts operating for
> more that 10 minutes
> > before Apache stops them and at busy times of day my
> script to summarize a
> > day's data could easily end up exceeding this.
> >
> > Anyone got experience and good advice on dealing with
> these issues.
> >
> > Cheers
> >
> > Greg
> >
>
--~--~---------~--~----~------------~-------~--~----~
NZ PHP Users Group: http://groups.google.com/group/nzphpug
To post, send email to [email protected]
To unsubscribe, send email to
[email protected]
-~----------~----~----~----~------~----~------~--~---