Hi, and thanks, Baron.
I should have been a bit clearer on the bulk insert - I am using a bulk
insert statement, as you assumed.
I'll put this onto the db server and check; I think that's a more
future-proof method.  Will this affect any of my linked tables (linked
via the row's primary key (id))?

-----Original Message-----
From: Baron Schwartz [mailto:[EMAIL PROTECTED] 
Sent: 31 August 2007 14:28
To: Esbach, Brandon
Cc: MySQL User Group
Subject: Re: Two/more separately unique columns in a table?

Hi,

Esbach, Brandon wrote:
> Hi all,
> 
> I'm guessing this is a fairly common question, but I've drawn a blank 
> so far with all workarounds possible.
> Here's the rough scenario:
> We have data streaming in constantly from various offsite locations.
> This comprises several details: a session, a unit of that session, 
> and a measurement of that unit.
> Data is captured and transferred on a timed process, so often data is 
> repeated in the data transfer packets.  This repeating is unavoidable 
> as the software used to capture dumps this data for each measurement 
> and each unit for the session it's working on.
> 
> Due to the volume, a bulk update is done using "values()" with an 
> insert statement.
> 
> Unfortunately, often there are repeats of either session, unit, or 
> measurement (as there could be a new measurement or unit for the 
> capture that needs to be associated to the session).
> The problem I've been experiencing is fairly straightforward (I hope):
> I have two, sometimes three columns in any given record that need to 
> always be unique.  This comprises an ID (the key column), and one 
> (depending on the table, sometimes two) GUIDs which should be unique 
> at all times for the entire table.
> 
> I've tried setting the additional columns to be a primary key (which 
> in turn sets them up to be "unique" when viewed under "Schema Indices" 
> on the MySQL Administrator tool); however, this does not give an error 
> (or fail silently with "insert ignore") when I insert a duplicate - 
> MySQL seems quite happy to add the duplicate record.
> 
> At the moment, I'm running a process at the beginning which simply 
> gathers all GUIDs from the database and compares them as it runs 
> through the data (then adds new ones as it runs).  This is hardly 
> reliable, and also means starting the service would take several 
> hours to gather the existing GUIDs at current data levels... almost 
> frightening to think what will end up happening as the data expands.

It sounds like you need a separate primary key and unique index:

create table t (
  id int not null,
  guid char(32) not null,
  primary key (id),
  unique key (guid)
);
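
With that schema in place, a duplicate GUID is rejected at insert time
instead of being silently accepted.  A rough illustration (table and
values are just made up for the example):

  INSERT INTO t (id, guid)
  VALUES (1, '0123456789abcdef0123456789abcdef');

  -- A different id but the same guid now violates the unique key,
  -- so this fails with error 1062 (duplicate entry for key 'guid'):
  INSERT INTO t (id, guid)
  VALUES (2, '0123456789abcdef0123456789abcdef');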

Then you can do REPLACE or IGNORE with LOAD DATA INFILE.  I can't tell
if you are actually using LOAD DATA INFILE or if your "bulk load" is a
big INSERT statement.  If you're using an INSERT with multiple
VALUES() sections, you can also use ON DUPLICATE KEY UPDATE.
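
For example, against the table above (values again made up), a
multi-row insert could either skip duplicates or update them in place:

  -- Silently skip any row whose id or guid already exists:
  INSERT IGNORE INTO t (id, guid)
  VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'),
         (2, 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb');

  -- Or update the existing row instead of failing:
  INSERT INTO t (id, guid)
  VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa')
  ON DUPLICATE KEY UPDATE guid = VALUES(guid);

The equivalent for a file load would be LOAD DATA INFILE ... IGNORE
INTO TABLE t (or REPLACE to overwrite the existing rows).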

I agree the current strategy won't hold up well over time.

Baron

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]