Hello,
 
This is a long post, my apologies.
 
I have been working on migrating a database from MS SQL to MySQL for about 1
month now and am at a point where I can start running some tests to see how
they compare. I'll provide a little background to help you out.
 
The MS SQL database is 120million records long and 400 fields wide. It's all
in one table with 19 indexes. There is a cluster index on the ZIP code
field. This database is running on a DELL dual Xeon (MHz 2MB cpu cache, 4GB
RAM), DELL Powervault RAID5 storage array.
 
My initial plan was to simply move the big db/table structure from MS SQL to
MySQL but we found that indexing was a roadblock. The indexes (after 3 of
them) would start to take longer and longer. We are still performing testing
on this as it's possible that some of our hardware (DELL 8450s) may have
issues running with an EMC Symmetrix RAID0+1. In the end, we normalized the
data into 17 tables with no more than 3 indexes per table.
 
The MySQL database is 106million records (we got rid of some duplicates)
long and 250 fields (we got rid of some unused fields) wide. As mentioned
above, it's in 17 tables. This database is running on a HP quad Xeon (500MHz
512MB cpu cache, 8GB RAM), EMC Symmetrix storage array.
 
I am including the queries in this post to see if anyone has any advice on
what we could do to make these queries run faster. I realize that the
hardware isn't identical but I think it's close enough to get somewhat
accurate comparisons.
 
Here are the queries. the first query in each case is the MS SQL query. The
second query is the MySQL query. At the beginning section for the queries, I
indicate which fields are indexed (it's the same for both tables). MS SQL
beats MySQL handily except for one query and I'm not sure why. All queries
were run as the only process on each server.
 
QUERY1
~~~~~~~~~~~
Indexed Fields (link, phone_pander,state, exact_age, estimated_age, phone,
first, last, address)
MS SQL QUERY (TIME TO COMPLETE: 21m33s)
SELECT
phone,first,last,address,address2,city,state,zip,plus4,county,phone_pander_f
lag,homeowner,probable_homeowner,homeowner_probability_model,exact_age,estim
ated_age
FROM consumer1
WHERE phone_pander_flag <> 'Y'
AND STATE = 'PA'
AND (homeowner = 'Y' OR probable_homeowner IN ('8','9') OR
homeowner_probability_model BETWEEN '080' AND '102')
AND (exact_age BETWEEN '40' AND '60' OR estimated_age BETWEEN '40' AND '60')
AND phone > ' '
AND first > ' '
AND last > ' '
AND address > ' ';
 
MySQL QUERY(TIME TO COMPLETE: 42m27s)
SELECT 
 a1.phone, p1a.first, p1a.last, a3.address, a1.address2, a1.city, 
 state.state, a2.zip, a1.plus4, a3.county, 
 a1.phone_pander_flag, h1.homeowner, h2.probable_homeowner, 
 h2.homeowner_probability_model, p1c.exact_age, p1c.estimated_age
FROM a1
INNER JOIN a2    on a2.link    = a1.link
INNER JOIN a3    on a3.link    = a1.link
INNER JOIN h1    on h1.link    = a1.link
INNER JOIN h2    on h2.link    = a1.link
INNER JOIN p1a   on p1a.link   = a1.link
INNER JOIN p1c   on p1c.link   = a1.link
INNER JOIN state on state.state_code = a2.state_code
WHERE 
 a1.phone_pander_flag <> 'Y'
 AND state.state = 'PA'
 AND ( h1.homeowner = 'Y' 
  OR h2.probable_homeowner IN ('8','9') 
  OR h2.homeowner_probability_model BETWEEN '080' AND '102' )
 AND ( p1c.exact_age BETWEEN '40' AND '60'
  OR estimated_age BETWEEN '40' AND '60' )
 AND a1.phone is not null
 AND p1a.first is not null
 AND p1a.last is not null
 AND a3.address is not null;

QUERY2
~~~~~~~~
Indexed Fields (link, zip, phone_pander,phone, first, last, address)
MS SQL QUERY(12 seconds)
SELECT
phone,first,last,address,address2,city,state,zip,plus4,county,phone_pander_f
lag,homeowner,probable_homeowner,homeowner_probability_model,number_of_units
,primary_ethnic_market_code,household_hispanic
FROM consumer1
WHERE zip in ('60402','60650','60607','60608','60612','60622')
AND phone_pander_flag<>'y'
AND (homeowner='y' OR probable_homeowner in ('8','9') OR
homeowner_probability_model BETWEEN '080' AND '102')
AND phone > ' '
AND first > ' '
AND last > ' '
AND address > ' '
AND number_of_units IN ('0001','0002')
AND (primary_ethnic_market_code='38' OR household_hispanic>='0850')

MySQL QUERY(10min51secs)
SELECT 
 a1.phone, p1a.first, p1a.last, a3.address, a1.address2, a1.city, 
 state.state, a2.zip, a1.plus4, a3.county, a1.phone_pander_flag,
h1.homeowner, 
 h2.probable_homeowner, h2.homeowner_probability_model, 
 a1.number_of_units, p1a.primary_ethnic_market_code,
c2k.2000_census_hispanic
FROM a1
INNER JOIN a2    on a2.link    = a1.link
INNER JOIN a3    on a3.link    = a1.link
INNER JOIN h1    on h1.link    = a1.link
INNER JOIN h2    on h2.link    = a1.link
INNER JOIN p1a   on p1a.link   = a1.link
INNER JOIN c2k   on c2k.link   = a1.link
INNER JOIN state on state.state_code = a2.state_code
WHERE 
 a2.zip in ('60402', '60650', '60607', '60608', '60612', '60622')
 AND a1.phone_pander_flag <> 'Y'
 AND ( h1.homeowner='Y' 
  OR h2.probable_homeowner in ('8','9') 
  OR h2.homeowner_probability_model BETWEEN '080' AND '102')
 AND a1.phone is not null
 AND p1a.first is not null
 AND p1a.last is not null
 AND a3.address is not null
 AND a1.number_of_units IN ('0001','0002')
 AND ( p1a.primary_ethnic_market_code='38' 
  OR c2k.2000_census_hispanic >='0850' );

 
QUERY3
~~~~~~~~~
Indexed Fields (link, zip, phone_pander, phone, first, last, address)
MS SQL QUERY(10seconds)
SELECT
phone,first,last,address,address2,city,state,zip,plus4,county,home_purchase_
date,mortgage_amount_in_thousands,phone_pander_flag
FROM consumer1
WHERE ZIP IN ('14221','14224','14226','14227','14051','14031','14057')
AND phone_pander_flag <> 'Y'
AND phone > ' '
AND first > ' '
AND last > ' '
AND address > ' '
AND MORTGAGE_AMOUNT_IN_THOUSANDS>='0075'
AND SUBSTRING(HOME_PURCHASE_DATE,1,4)='2003'

MySQL QUERY(5min22secs)
SELECT 
 a1.phone, p1a.first, p1a.last, a3.address, a1.address2, a1.city, 
 state.state, a2.zip, a1.plus4, a3.county, h1.home_purchase_date, 
 mtg.mortgage_amount_in_thousands, a1.phone_pander_flag
FROM a1
INNER JOIN a2    on a2.link    = a1.link
INNER JOIN a3    on a3.link    = a1.link
INNER JOIN h1    on h1.link    = a1.link
INNER JOIN p1a   on p1a.link   = a1.link
INNER JOIN mtg   on mtg.link   = a1.link
INNER JOIN state on state.state_code = a2.state_code
WHERE 
 a2.zip IN ('14221', '14224', '14226', '14227', '14051', '14031', '14057')
 AND a1.phone_pander_flag <> 'Y'
 AND a1.phone is not null
 AND p1a.first is not null
 AND p1a.last is not null
 AND a3.address is not null
 AND mtg.mortgage_amount_in_thousands>='0075'
 AND (h1.home_purchase_date > '2003' and h1.home_purchase_date < '2004');

 
QUERY4
~~~~~~~~~~
Indexed Fields (link, state, first, last, address)
MS SQL QUERY(8hrs50mins)
SELECT
phone,first,last,address,address2,city,state,zip,plus4,county,mortgage_loan_
type,mortgage_amount_in_thousands,home_purchase_price,home_purchase_date
FROM consumer1
WHERE state IN ('PA','OH')
AND first > ' '
AND last > ' '
AND address > ' '
AND mortgage_loan_type='V'
AND (mortgage_amount_in_thousands>='0100' OR home_purchase_price>='0100000')
AND home_purchase_date between '2002' and '2003'

MySQL QUERY(2hrs17mins)
SELECT 
 a1.phone, p1a.first, p1a.last, a3.address, a1.address2, 
 a1.city, state.state, a2.zip, a1.plus4, a3.county, 
 mtg.mortgage_loan_type, mtg.mortgage_amount_in_thousands, 
 h1.home_purchase_price, h1.home_purchase_date
FROM a1
INNER JOIN a2    on a2.link    = a1.link
INNER JOIN a3    on a3.link    = a1.link
INNER JOIN h1    on h1.link    = a1.link
INNER JOIN p1a   on p1a.link   = a1.link
INNER JOIN mtg   on mtg.link   = a1.link
INNER JOIN state on state.state_code = a2.state_code
WHERE 
 state.state IN ('PA', 'OH')
 AND p1a.first is not null
 AND p1a.last is not null
 AND a3.address is not null
 AND mtg.mortgage_loan_type = 'V'
 AND (mtg.mortgage_amount_in_thousands>='0100' OR
h1.home_purchase_price>='0100000')
 AND h1.home_purchase_date between '2002' and '2003';

QUERY5
~~~~~~~~~~
Indexed Fields (link, state, first, last, address)
MS SQL QUERY(27min28secs)
SELECT
phone,first,last,address,address2,city,state,zip,plus4,county,phone_pander_f
lag,number_of_units,dwelling_type,homeowner_probability_model,home_property_
indicator
FROM consumer1
WHERE state='FL'
AND homeowner_probability_model<='020'
AND first > ' '
AND last > ' '
AND address > ' '
AND (number_of_units>='0003' OR dwelling_type='A')
AND home_property_indicator<>'2'

MySQL QUERY(3hrs53mins)
SELECT 
 a1.phone, p1a.first, p1a.last, a3.address, a1.address2, 
 a1.city, state.state, a2.zip, a1.plus4, a3.county, 
 a1.phone_pander_flag, a1.number_of_units, h1.dwelling_type, 
 h2.homeowner_probability_model, h1.home_property_indicator
FROM a1
INNER JOIN a2    on a2.link    = a1.link
INNER JOIN a3    on a3.link    = a1.link
INNER JOIN h1    on h1.link    = a1.link
INNER JOIN h2    on h2.link    = a1.link
INNER JOIN p1a   on p1a.link   = a1.link
INNER JOIN mtg   on mtg.link   = a1.link
INNER JOIN state on state.state_code = a2.state_code
WHERE 
 state.state = 'FL'
 AND h2.homeowner_probability_model <= '020'
 AND p1a.first is not null
 AND p1a.last is not null
 AND a3.address is not null
 AND ( a1.number_of_units>='0003' 
  OR h1.dwelling_type='A' )
 AND h1.home_property_indicator <> '2';

 
Any and all help is appreciated.
 
Thanks.
 
Chris.

Reply via email to