Looks to me like skuagent should be using the varchar fields instead of the int fields.
Then you could just INSERT IGNORE each line from your text file.
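A minimal sketch of that suggestion, assuming skuagent is re-keyed on the
varchar values themselves (column sizes copied from the sku/agent tables
quoted below):

```sql
CREATE TABLE skuagent (
  sku   varchar(60) NOT NULL,
  agent varchar(60) NOT NULL,
  UNIQUE KEY skuagent (sku, agent),
  KEY agentsku (agent, sku)
);

-- Each line of the file then becomes one idempotent statement;
-- pairs already in the table are silently skipped:
INSERT IGNORE INTO skuagent (sku, agent)
VALUES ('SKU_ONE', 'AGENT_ONE');
```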

Paul Chvostek wrote:

Hi all.

I need some help figuring out how to update a many-to-many relationship
from a text file.

For the fun of it, let's call the tables 'sku' and 'agent'. Each sku
can have multiple agents (usually < 10, always > 0), and each agent can
be associated with an unlimited number of skus. Let's say, for
example, we're using:

CREATE TABLE sku (
  id int unsigned NOT NULL auto_increment,
  sku varchar(60) NOT NULL default '',
  PRIMARY KEY (id),
  UNIQUE KEY sku (sku)
);

CREATE TABLE agent (
  id mediumint unsigned NOT NULL auto_increment,
  agent varchar(60) NOT NULL default '',
  PRIMARY KEY (id),
  UNIQUE KEY agent (agent)
);

CREATE TABLE skuagent (
  sku int(10) unsigned NOT NULL,
  agent mediumint(5) unsigned NOT NULL,
  UNIQUE KEY skuagent (sku,agent),
  KEY agentsku (agent,sku)
);

That's fine as far as it goes, but I can't figure out how to repopulate
the tables when new data comes in. On a regular basis (probably once
every two days), I'll be getting a new text file with the relationships
in a format that looks like:

SKU_ONE AGENT_ONE
SKU_ONE AGENT_TWO
SKU_TWO AGENT_ONE
SKU_TWO AGENT_TWO
SKU_TWO AGENT_THREE
SKU_THREE AGENT_BLUE
SKU_THREE AGENT_ORANGE

etc. The text is what gets shoved into the varchar columns. The input
text file does indeed have SKUs grouped as shown, so it's easy to `uniq`.
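For reference, a file in that shape can be bulk-loaded into a scratch table
rather than parsed line by line; the table name and file path here are
assumptions, not part of the original setup:

```sql
CREATE TABLE staging (
  sku   varchar(60) NOT NULL,
  agent varchar(60) NOT NULL,
  KEY (sku)
);

-- The file is space-separated, one relationship per line:
LOAD DATA LOCAL INFILE '/tmp/relationships.txt'
  INTO TABLE staging
  FIELDS TERMINATED BY ' '
  LINES TERMINATED BY '\n'
  (sku, agent);
```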

The problem is that with each update, I'll be getting a different set of
relationships on perhaps 3% of the SKUs, with some new SKUs and AGENTs,
and some removed. I can add the new records easily enough, and "orphan"
records aren't a problem. What I can't figure out is a good way to do
the many-to-many update.

I can do it by emptying the skuagent table every time I get a new file,
then re-populating it from scratch by running a new INSERT...SELECT for
each line in the file. But the text file has over forty million lines,
with 15 million unique SKUs and half a million AGENTs. I really don't
want to take the database down for hours at a time just to refresh.
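If the file has first been loaded into a scratch table (called staging here,
an assumption), the wipe-and-repopulate can at least be one set-based
statement instead of forty million per-line INSERT...SELECTs:

```sql
TRUNCATE TABLE skuagent;

-- Resolve both varchar values to their ids in a single pass:
INSERT INTO skuagent (sku, agent)
SELECT s.id, a.id
FROM staging t
JOIN sku   s ON s.sku   = t.sku
JOIN agent a ON a.agent = t.agent;
```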

Or alternately, a grottier solution. Each time I get a new file, I can
run a script that will go through each unique SKU in the file, delete
the skuagent records for THAT SKU ONLY, and re-add them with current
data. It's *way* more processing time, with a round trip for every SKU in
the loop, but
at least I won't have to take the whole thing down.
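The per-SKU variant might look something like this, run once per distinct
SKU in the file ('SKU_TWO' stands in for the current one; the staging table
is an assumption):

```sql
-- Drop the old relationships for this SKU only:
DELETE skuagent
FROM skuagent
JOIN sku s ON s.id = skuagent.sku
WHERE s.sku = 'SKU_TWO';

-- Re-add the current ones from the freshly loaded file:
INSERT INTO skuagent (sku, agent)
SELECT s.id, a.id
FROM staging t
JOIN sku   s ON s.sku   = t.sku
JOIN agent a ON a.agent = t.agent
WHERE t.sku = 'SKU_TWO';
```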

I just can't seem to figure out an elegant solution. Is there one, or
must I do this the ugly way?

Any advice would be appreciated. :)

p

(And for the list robot: sql,query,queries,smallint)


