I would first try refactoring your SQL to use INNER JOIN statements instead of the comma separated lists you are currently using. I would also not use any subqueries. Test this and see if it works for you:
SELECT SUM(li.quantity) as qtysoldytd FROM LineItem li INNER JOIN Sales sa on li.saletranID=sa.saletranID and YEAR(sa.solddate)=2005 INNER JOIN Inventory inv on inv.invID = li.invID INNER JOIN Category cat on cat.categoryid = inv.categoryid AND cat.vendcode='AA'; The linkages work like this: 1) LineItem links into Sales through saletranID and YEAR(solddate) 2) Sales links into Inventory through the invID 3) Inventory links to Category through categoryid and vendcode Because I used INNER JOINs, each link in the chain must exist across all tables or the row cannot be added to the final results. Because this query contains several joins and your table sizes are not insignificant it becomes a candidate for what I call "piecewize evaluation". Piecewize evaluation is where you take the full query and build your desired results in stages. One stage that jumps out at me is the conversion of vendcode to a list of invID values. Another stage could be isolating just those line items for 2005. I suggest this because JOINing two tables (either by explicit declaration as I do or by comma-separated lists as you did) is a geometrically expensive operation (it's cost to compute grows by multiplying how many rows are participating from each table). If we start with two tables M and N and they each have m and n rows in them, a JOIN operation takes on the order of m*n cycles to compute. If we can somehow shrink each table participating in the JOIN (by pre-selecting certain rows) so that we now have m/4 and n/2 rows to JOIN that reduces your overall cost to (m * n)/8. When we are discussing products of m*n on the order of 100 million rows or so, reducing production time by a factor of 8 is noticable. The situation is even more apparent if you add more tables. Consider if you had tables A, B, and C and they had a,b, and c rows in them. If you had to JOIN those three tables to build a query it would take a*b*c units of time to complete. If we were only able to reduce each table by 10%, that reduces the overall computation to (.9*a)*(.9*b)*(.9*c) = .729(abc) If: a = 50,000 b = 500,000 c = 800,000 records The original execution cost is proportional to: (50000 * 500000 * 800000) = 20000000000000000 (2.0e16) after 10% reductions through precomputations: 2.0e16 * .729 = 1.458e16 --------------------------------------------------------------------------- # of rows combinations NOT fed through the CPU to be evaluated as being in the result or not: 2.0e16 - 1.458e16 = 5.42e+15 = 5420000000000000 How long do you think it takes even a modern computer to do 5420000000000000 tests? It can make a serious difference. Piecewize evaluation works VERY WELL in stored procedures (if you are on v5.0 or higher) because you can parameterize your queries quite easily and you are assured of executing the same query pattern every time you need it. ## stage 1 - identifying Line items from 2005 CREATE TEMPORARY TABLE tmpLI ( KEY(invID) ) SELECT li.invID, li.quantity FROM LineItem li INNER JOIN Sales sa on li.saletranID=sa.saletranID and YEAR(sa.solddate)=2005 ## stage 2 - identifying Inventory Items for a certain category CREATE TEMPORARY TABLE tmpInv ( KEY(invID) ) SELECT DISTINCT invID FROM Inventory inv on inv.invID = li.invID INNER JOIN Category cat on cat.categoryid = inv.categoryid AND cat.vendcode='AA'; ## stage 3 - compute your desired results SELECT SUM(li.quantity) FROM tmpLI li INNER JOIN tmpInf inv ON inv.invID = li.invID; ## stage 4 - the database is not your momma. Clean up after yourself... DROP TEMPORARY TABLE tmpLi; DROP TEMPORARY TABLE tmpInv; ## end query I hope that helps (HTH), Shawn Green Database Administrator Unimin Corporation - Spruce Pine [EMAIL PROTECTED] wrote on 11/04/2005 12:28:50 AM: > I cannot figure this one out. I have a Category table with 50,000 > records, an Inventory table with over 2 million records. A Sales table > with 500,000 records. And a LineItem table with 800,000 records pairing > the Inventory ID with the Sales Transaction ID and Quantity. I need to > generate a Quantity sold year to date for a certain vendor. The vendor > code can be found in the Category table which has a relationship with > Inventory. I am trying a SQL statement like this: > > select sum(li.quantity) as qtysoldytd from LineItem li, Sales sa where > (li.saletranID=sa.saletranID and YEAR(sa.solddate)=2005) and li.InvID > IN (select invID from Inventory where categoryid IN (select categoryid > from Category where vendcode='AA')) > > this yields null when I know there are sales for that vendor in 2005. > Simplified schemas for the tables are as follows: > Category: > +----------------+------------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra > | > +----------------+------------------+------+-----+---------+----------------+ > | vendcode | char(3) | YES | MUL | NULL | > | > | categoryID | int(10) unsigned | | PRI | NULL | > auto_increment | > +----------------+------------------+------+-----+---------+----------------+ > > Inventory: > +--------------+---------------+------+-----+---------+-------+ > | Field | Type | Null | Key | Default | Extra | > +--------------+---------------+------+-----+---------+-------+ > | categoryID | int(11) | YES | MUL | NULL | | > | invID | int(10) | | PRI | 0 | | > | itemnum | int(11) | YES | MUL | NULL | | > +--------------+---------------+------+-----+---------+-------+ > > Sales: > +--------------+------------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra > | > +--------------+------------------+------+-----+---------+----------------+ > | saletranID | int(10) unsigned | | PRI | NULL | > auto_increment | > | solddate | datetime | YES | | NULL | > | > +--------------+------------------+------+-----+---------+----------------+ > > LineItem: > +------------+---------+------+-----+---------+-------+ > | Field | Type | Null | Key | Default | Extra | > +------------+---------+------+-----+---------+-------+ > | invID | int(10) | YES | MUL | NULL | | > | quantity | int(10) | YES | | NULL | | > | saletranID | int(10) | YES | MUL | NULL | | > +------------+---------+------+-----+---------+-------+ > > Can anybody shed some light on this and if this is even possible. I have > indexes in place and the query is still slow to pull. > Thanks a million, > Nathan > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] >