Re: updating many-to-many relationships?

2003-02-20 Thread gerald_clark
I don't have a mailing list, and I don't appreciate your email.
You subscribed to the mysql mailing list, or you would not be getting it.
The instructions to unsubscribe are at the bottom of every message.

Kip McGee wrote:


will you please take me off of your mailing list if you dont im gonna 
send my knee breaker QUEDO SARDUCHIE
Please do not make me send Quedo  he is a ornery ass whipping midgit.
I have been getting over a hundred e mails every day from your company
and i cannot even use my computer becouse it is so full caca!!
I must be on your mailing list and i wish to be taken off of it 
immediatly.
my email address is [EMAIL PROTECTED]





-
Before posting, please check:
  http://www.mysql.com/manual.php   (the manual)
  http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




updating many-to-many relationships?

2003-02-19 Thread Paul Chvostek

Hi all.

I need some help figuring out how to update a many-to-many relationship
from a text file.

For the fun of it, let's call the tables 'sku' and 'agent'.  Each sku
can have multiple agents (usually  10, always  0), and each agent can
be associated with an unlimited number of records skus.  Let's say, for
example, we're using:

CREATE TABLE sku (
  id int unsigned NOT NULL auto_increment,
  sku varchar(60) NOT NULL default '',
  PRIMARY KEY  (id),
  UNIQUE KEY sku (sku),
);

CREATE TABLE agent (
  id mediumint unsigned NOT NULL auto_increment,
  agent varchar(60) NOT NULL default '',
  PRIMARY KEY  (id),
  UNIQUE KEY agent (agent),
);

CREATE TABLE skuagent (
  sku int(10) unsigned NOT NULL,
  agent mediumint(5) unsigned NOT NULL,
  UNIQUE KEY skuagent (sku,agent),
  KEY agentsku (agent,sku),
);

That's fine as far as it goes, but I can't figure out how to repopulate
the tables when new data comes in.  On a regular basis (probably once
every two days), I'll be getting a new text file with the relationships
in a format that looks like:

SKU_ONE AGENT_ONE
SKU_ONE AGENT_TWO
SKU_TWO AGENT_ONE
SKU_TWO AGENT_TWO
SKU_TWO AGENT_THREE
SKU_THREE AGENT_BLUE
SKU_THREE AGENT_ORANGE

etc.  The text is what gets shoved into the varchar columns.  The input
text file does indeed have SKUs grouped as shown, so it's easy to `uniq`.

The problem is that with each update, I'll be getting a different set of
relationships on perhaps 3% of the SKUs, with some new SKUs and AGENTs,
and some removed.  I can add the new records easily enough, and orphan
records aren't a problem.  What I can't figure out is a good way to do
the many-to-many update.

I can do it by emptying the skuagent table every time I get a new file,
then re-populating it from scratch by running a new INSERT...SELECT for
each line in the file.  But the text file has over forty million lines,
with 15 million unique SKUs and half a million AGENTs.  I really don't
want to take the database down for hours at a time just to refresh.

Or alternately, a grottier solution.  Each time I get a new file, I can
run a script that will go through each unique SKU in the file, delete
the skuagent records for THAT SKU ONLY, and re-add them with current
data.  It's *way* more processing time for every host in the loop, but
at least I won't have to take the whole thing down.

I just can't seem to figure out an elegant solution.  Is there one, or
must I do this the ugly way?

Any advice would be appreciated.  :)

p

(And for the list robot: sql,query,queries,smallint)

-- 
  Paul Chvostek [EMAIL PROTECTED]
  Operations / Abuse / Whatever
  it.canada, hosting and development   http://www.it.ca/


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: updating many-to-many relationships?

2003-02-19 Thread gerald_clark
Looks to my like skuagent should be using the varchar fields instead of 
the int fields.
Then you could just insert ignore each line from your test file.

Paul Chvostek wrote:

Hi all.

I need some help figuring out how to update a many-to-many relationship
from a text file.

For the fun of it, let's call the tables 'sku' and 'agent'.  Each sku
can have multiple agents (usually  10, always  0), and each agent can
be associated with an unlimited number of records skus.  Let's say, for
example, we're using:

CREATE TABLE sku (
 id int unsigned NOT NULL auto_increment,
 sku varchar(60) NOT NULL default '',
 PRIMARY KEY  (id),
 UNIQUE KEY sku (sku),
);

CREATE TABLE agent (
 id mediumint unsigned NOT NULL auto_increment,
 agent varchar(60) NOT NULL default '',
 PRIMARY KEY  (id),
 UNIQUE KEY agent (agent),
);

CREATE TABLE skuagent (
 sku int(10) unsigned NOT NULL,
 agent mediumint(5) unsigned NOT NULL,
 UNIQUE KEY skuagent (sku,agent),
 KEY agentsku (agent,sku),
);

That's fine as far as it goes, but I can't figure out how to repopulate
the tables when new data comes in.  On a regular basis (probably once
every two days), I'll be getting a new text file with the relationships
in a format that looks like:

	SKU_ONE AGENT_ONE
	SKU_ONE AGENT_TWO
	SKU_TWO AGENT_ONE
	SKU_TWO AGENT_TWO
	SKU_TWO AGENT_THREE
	SKU_THREE AGENT_BLUE
	SKU_THREE AGENT_ORANGE

etc.  The text is what gets shoved into the varchar columns.  The input
text file does indeed have SKUs grouped as shown, so it's easy to `uniq`.

The problem is that with each update, I'll be getting a different set of
relationships on perhaps 3% of the SKUs, with some new SKUs and AGENTs,
and some removed.  I can add the new records easily enough, and orphan
records aren't a problem.  What I can't figure out is a good way to do
the many-to-many update.

I can do it by emptying the skuagent table every time I get a new file,
then re-populating it from scratch by running a new INSERT...SELECT for
each line in the file.  But the text file has over forty million lines,
with 15 million unique SKUs and half a million AGENTs.  I really don't
want to take the database down for hours at a time just to refresh.

Or alternately, a grottier solution.  Each time I get a new file, I can
run a script that will go through each unique SKU in the file, delete
the skuagent records for THAT SKU ONLY, and re-add them with current
data.  It's *way* more processing time for every host in the loop, but
at least I won't have to take the whole thing down.

I just can't seem to figure out an elegant solution.  Is there one, or
must I do this the ugly way?

Any advice would be appreciated.  :)

p

(And for the list robot: sql,query,queries,smallint)

 




-
Before posting, please check:
  http://www.mysql.com/manual.php   (the manual)
  http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: updating many-to-many relationships?

2003-02-19 Thread Paul Chvostek

I *really* don't want to have a pair of indexes on 60-character varchars
that cover forty million records.  That would increase the size of each
index from 280MB to 4.8GB.  Just for the indexes.  I don't think so.

Also, I don't see how it would address the issue of updates.  The
problem is that for each sku, the list of agents is *changing*.  Some
will be added, some will be removed.  I'm already doing an insert ignore
to top up the sku and agent tables, so populating the skuagent table
with new entries consists of:

 INSERT INTO skuagent (sku,agent) SELECT sku.id,agent.id FROM sku,agent
   WHERE sku.sku='%s' AND agent.agent='%s';

for each line in the input text file.  But the more I think about this,
the more it looks as if I'm going to have to process things the slow and
ugly way

Thanks anyway for looking at this.

p

On Wed, Feb 19, 2003 at 08:13:12AM -0600, gerald_clark wrote:
 
 Looks to my like skuagent should be using the varchar fields instead of 
 the int fields.
 Then you could just insert ignore each line from your test file.
 
 Paul Chvostek wrote:
 
 Hi all.
 
 I need some help figuring out how to update a many-to-many relationship
 from a text file.
 
 For the fun of it, let's call the tables 'sku' and 'agent'.  Each sku
 can have multiple agents (usually  10, always  0), and each agent can
 be associated with an unlimited number of records skus.  Let's say, for
 example, we're using:
 
 CREATE TABLE sku (
   id int unsigned NOT NULL auto_increment,
   sku varchar(60) NOT NULL default '',
   PRIMARY KEY  (id),
   UNIQUE KEY sku (sku),
 );
 
 CREATE TABLE agent (
   id mediumint unsigned NOT NULL auto_increment,
   agent varchar(60) NOT NULL default '',
   PRIMARY KEY  (id),
   UNIQUE KEY agent (agent),
 );
 
 CREATE TABLE skuagent (
   sku int(10) unsigned NOT NULL,
   agent mediumint(5) unsigned NOT NULL,
   UNIQUE KEY skuagent (sku,agent),
   KEY agentsku (agent,sku),
 );
 
 That's fine as far as it goes, but I can't figure out how to repopulate
 the tables when new data comes in.  On a regular basis (probably once
 every two days), I'll be getting a new text file with the relationships
 in a format that looks like:
 
  SKU_ONE AGENT_ONE
  SKU_ONE AGENT_TWO
  SKU_TWO AGENT_ONE
  SKU_TWO AGENT_TWO
  SKU_TWO AGENT_THREE
  SKU_THREE AGENT_BLUE
  SKU_THREE AGENT_ORANGE
 
 etc.  The text is what gets shoved into the varchar columns.  The input
 text file does indeed have SKUs grouped as shown, so it's easy to `uniq`.
 
 The problem is that with each update, I'll be getting a different set of
 relationships on perhaps 3% of the SKUs, with some new SKUs and AGENTs,
 and some removed.  I can add the new records easily enough, and orphan
 records aren't a problem.  What I can't figure out is a good way to do
 the many-to-many update.
 
 I can do it by emptying the skuagent table every time I get a new file,
 then re-populating it from scratch by running a new INSERT...SELECT for
 each line in the file.  But the text file has over forty million lines,
 with 15 million unique SKUs and half a million AGENTs.  I really don't
 want to take the database down for hours at a time just to refresh.
 
 Or alternately, a grottier solution.  Each time I get a new file, I can
 run a script that will go through each unique SKU in the file, delete
 the skuagent records for THAT SKU ONLY, and re-add them with current
 data.  It's *way* more processing time for every host in the loop, but
 at least I won't have to take the whole thing down.
 
 I just can't seem to figure out an elegant solution.  Is there one, or
 must I do this the ugly way?
 
 Any advice would be appreciated.  :)
 
 p
 
 (And for the list robot: sql,query,queries,smallint)
 
   
 
 
 
 
 -
 Before posting, please check:
http://www.mysql.com/manual.php   (the manual)
http://lists.mysql.com/   (the list archive)
 
 To request this thread, e-mail [EMAIL PROTECTED]
 To unsubscribe, e-mail [EMAIL PROTECTED]
 Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

-- 
  Paul Chvostek [EMAIL PROTECTED]
  Operations / Abuse / Whatever
  it.canada, hosting and development   http://www.it.ca/


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php