Hi all.

I need some help figuring out how to update a many-to-many relationship
from a text file.

For the fun of it, let's call the tables 'sku' and 'agent'.  Each sku
can have multiple agents (usually < 10, always > 0), and each agent can
be associated with an unlimited number of skus.  Let's say, for
example, we're using:

CREATE TABLE sku (
  id int unsigned NOT NULL auto_increment,
  sku varchar(60) NOT NULL default '',
  PRIMARY KEY  (id),
  UNIQUE KEY sku (sku)
);

CREATE TABLE agent (
  id mediumint unsigned NOT NULL auto_increment,
  agent varchar(60) NOT NULL default '',
  PRIMARY KEY  (id),
  UNIQUE KEY agent (agent)
);

CREATE TABLE skuagent (
  sku int unsigned NOT NULL,
  agent mediumint unsigned NOT NULL,
  UNIQUE KEY skuagent (sku,agent),
  KEY agentsku (agent,sku)
);

That's fine as far as it goes, but I can't figure out how to repopulate
the tables when new data comes in.  On a regular basis (probably once
every two days), I'll be getting a new text file with the relationships
in a format that looks like:

        SKU_ONE AGENT_ONE
        SKU_ONE AGENT_TWO
        SKU_TWO AGENT_ONE
        SKU_TWO AGENT_TWO
        SKU_TWO AGENT_THREE
        SKU_THREE AGENT_BLUE
        SKU_THREE AGENT_ORANGE

etc.  The text is what gets shoved into the varchar columns.  The input
text file does indeed have SKUs grouped as shown, so it's easy to `uniq`.
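
In case it helps picture things, the file loads cleanly into a scratch
table with LOAD DATA INFILE.  Something like this (assuming a single
space between the two fields, and a made-up path):

CREATE TABLE skuagent_raw (
  sku varchar(60) NOT NULL,
  agent varchar(60) NOT NULL
);

LOAD DATA INFILE '/tmp/skuagent.txt'
  INTO TABLE skuagent_raw
  FIELDS TERMINATED BY ' '
  LINES TERMINATED BY '\n';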

The problem is that with each update, I'll be getting a different set of
relationships on perhaps 3% of the SKUs, with some new SKUs and AGENTs,
and some removed.  I can add the new records easily enough, and "orphan"
records aren't a problem.  What I can't figure out is a good way to do
the many-to-many update.

I can do it by emptying the skuagent table every time I get a new file,
then re-populating it from scratch by running a new INSERT...SELECT for
each line in the file.  But the text file has over 40 million lines,
with 15 million unique SKUs and half a million AGENTs.  I really don't
want to take the database down for hours at a time just to refresh.
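
For concreteness, that rebuild would look something like this, with one
INSERT ... SELECT per line of the file (values made up):

DELETE FROM skuagent;

INSERT INTO skuagent (sku, agent)
  SELECT s.id, a.id
    FROM sku s, agent a
   WHERE s.sku = 'SKU_ONE'
     AND a.agent = 'AGENT_ONE';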

Or, alternatively, a grottier solution.  Each time I get a new file, I
can run a script that goes through each unique SKU in the file, deletes
the skuagent records for THAT SKU ONLY, and re-adds them with current
data.  It's *way* more processing time, with a round trip for every SKU
in the loop, but at least I won't have to take the whole thing down.
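
Per SKU, the script would run something like this (ids and names made
up, with the script looking up the numeric id first so it works on
older servers without subqueries):

SELECT id FROM sku WHERE sku = 'SKU_ONE';

-- say that returns 12345; then clear and re-add that SKU's rows:
DELETE FROM skuagent WHERE sku = 12345;

INSERT INTO skuagent (sku, agent)
  SELECT 12345, id FROM agent
   WHERE agent IN ('AGENT_ONE','AGENT_TWO');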

I just can't seem to figure out an elegant solution.  Is there one, or
must I do this the ugly way?

Any advice would be appreciated.  :)

p

(And for the list robot: sql,query,queries,smallint)

-- 
  Paul Chvostek                                             <[EMAIL PROTECTED]>
  Operations / Abuse / Whatever
  it.canada, hosting and development                   http://www.it.ca/

