Then you could just INSERT IGNORE each line from your text file.
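A rough sketch of what that might look like per line, untested; it assumes
the sku and agent rows already exist, and leans on the UNIQUE KEY on
skuagent (sku,agent) to silently skip pairs that are already there:

INSERT IGNORE INTO skuagent (sku, agent)
  SELECT s.id, a.id
  FROM sku s, agent a
  WHERE s.sku = 'SKU_ONE' AND a.agent = 'AGENT_ONE';

Existing pairs are skipped instead of raising a duplicate-key error, so the
file can be replayed safely; it won't remove relationships that have
disappeared from the file, though.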
Paul Chvostek wrote:
Hi all.
I need some help figuring out how to update a many-to-many relationship
from a text file.
For the fun of it, let's call the tables 'sku' and 'agent'. Each sku
can have multiple agents (usually < 10, always > 0), and each agent can
be associated with an unlimited number of SKUs. Let's say, for
example, we're using:
CREATE TABLE sku (
  id int unsigned NOT NULL auto_increment,
  sku varchar(60) NOT NULL default '',
  PRIMARY KEY (id),
  UNIQUE KEY sku (sku)
);

CREATE TABLE agent (
  id mediumint unsigned NOT NULL auto_increment,
  agent varchar(60) NOT NULL default '',
  PRIMARY KEY (id),
  UNIQUE KEY agent (agent)
);

CREATE TABLE skuagent (
  sku int(10) unsigned NOT NULL,
  agent mediumint(5) unsigned NOT NULL,
  UNIQUE KEY skuagent (sku,agent),
  KEY agentsku (agent,sku)
);
That's fine as far as it goes, but I can't figure out how to repopulate
the tables when new data comes in. On a regular basis (probably once
every two days), I'll be getting a new text file with the relationships
in a format that looks like:
SKU_ONE AGENT_ONE
SKU_ONE AGENT_TWO
SKU_TWO AGENT_ONE
SKU_TWO AGENT_TWO
SKU_TWO AGENT_THREE
SKU_THREE AGENT_BLUE
SKU_THREE AGENT_ORANGE
etc. The text is what gets shoved into the varchar columns. The input
text file does indeed have SKUs grouped as shown, so it's easy to `uniq`.
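If it helps, pulling the file into a scratch table first is straightforward;
something like this should do it, assuming a single space between the
columns (the table name 'incoming' and the path are just placeholders):

CREATE TABLE incoming (
  sku varchar(60) NOT NULL,
  agent varchar(60) NOT NULL
);

LOAD DATA INFILE '/tmp/skuagent.txt' INTO TABLE incoming
  FIELDS TERMINATED BY ' ' LINES TERMINATED BY '\n';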
The problem is that with each update, I'll be getting a different set of
relationships on perhaps 3% of the SKUs, with some new SKUs and AGENTs,
and some removed. I can add the new records easily enough, and "orphan"
records aren't a problem. What I can't figure out is a good way to do
the many-to-many update.
I can do it by emptying the skuagent table every time I get a new file,
then re-populating it from scratch by running a new INSERT...SELECT for
each line in the file. But the text file has over forty million lines,
with 15 million unique SKUs and half a million AGENTs. I really don't
want to take the database down for hours at a time just to refresh.
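In rough terms, with a scratch table like 'incoming' above, the rebuild
collapses to a few set-based statements instead of one per line; something
like this, untested:

DELETE FROM skuagent;

INSERT IGNORE INTO sku (sku)
  SELECT DISTINCT sku FROM incoming;

INSERT IGNORE INTO agent (agent)
  SELECT DISTINCT agent FROM incoming;

INSERT INTO skuagent (sku, agent)
  SELECT s.id, a.id
  FROM incoming i, sku s, agent a
  WHERE s.sku = i.sku AND a.agent = i.agent;

But even done that way, rewriting forty million skuagent rows is what keeps
the table out of service while it runs.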
Or alternatively, a grottier solution. Each time I get a new file, I can
run a script that will go through each unique SKU in the file, delete
the skuagent records for THAT SKU ONLY, and re-add them with current
data. It's *way* more processing time, with a round trip for every SKU
in the loop, but at least I won't have to take the whole thing down.
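Per SKU, that refresh would look roughly like this (a sketch; the id 42 and
the agent names stand in for values the script reads from the file):

SELECT id FROM sku WHERE sku = 'SKU_TWO';   # say this returns 42

DELETE FROM skuagent WHERE sku = 42;

INSERT INTO skuagent (sku, agent)
  SELECT 42, id FROM agent
  WHERE agent IN ('AGENT_ONE','AGENT_TWO','AGENT_THREE');

Fifteen million of those round trips per update is where the extra
processing time goes.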
I just can't seem to figure out an elegant solution. Is there one, or
must I do this the ugly way?
Any advice would be appreciated. :)
p