LOAD DATA INFILE using 4.0.17

2005-02-15 Thread shaun thornburgh
Hi,
I have a table with 26 fields, each row in this table must be unique. I 
can't define all of the fields to be primary keys as the limit is 16. 
Therefore before I insert data I have to check that each row is unique. Data 
is to be inserted into this table from files uploaded to the server - CSV 
files. Is it possible to use the LOAD DATA INFILE statement to check for 
unique rows, or does anyone have a better solution to my problem!

Thanks for any advice offered.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: LOAD DATA INFILE using 4.0.17

2005-02-15 Thread SGreen
shaun thornburgh [EMAIL PROTECTED] wrote on 02/15/2005 
04:53:54 PM:

 Hi,
 
 I have a table with 26 fields, each row in this table must be unique. I 
 can't define all of the fields to be primary keys as the limit is 16. 
 Therefore before I insert data I have to check that each row is unique. 
Data 
 is to be inserted into this table from files uploaded to the server - 
CSV 
 files. Is it possible to use the LOAD DATA INFILE statement to check for 

 unique rows, or does anyone have a better solution to my problem!
 
 Thanks for any advice offered.
 
 
 


What you want to do is called data scrubbing. Exactly how always depends 
on resources and the data itself however the general procedure works 
something like this:

*create an import table that matches your source data
*Import your data (without changes or omissions if at all possible) to 
this intermediate table.
*Validate your imported data to make sure you have everything you wanted 
from the LOAD DATA INFILE command.
*Eliminate any duplicate rows from your imported data table (many ways to 
do this. comparing the hash values for each row is one idea)
*use your import table as the source for your final update/insert
*drop your intermediate table.

I would normally get into more details but it's time to head home.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Re: LOAD DATA INFILE using 4.0.17

2005-02-15 Thread shaun thornburgh
Hi,
Thanks for your reply, but the problem I am facing is that there may be 
duplicate values in the uploaded file and I dont want these to appear in my 
table...

From: Bastian Balthazar Bux [EMAIL PROTECTED]
To: shaun thornburgh [EMAIL PROTECTED]
Subject: Re: LOAD DATA INFILE using 4.0.17
Date: Tue, 15 Feb 2005 23:32:56 +0100
shaun thornburgh ha scritto:
Hi,
I have a table with 26 fields, each row in this table must be unique. I 
can't define all of the fields to be primary keys as the limit is 16. 
Therefore before I insert data I have to check that each row is unique. 
Data is to be inserted into this table from files uploaded to the server - 
CSV files. Is it possible to use the LOAD DATA INFILE statement to check 
for unique rows, or does anyone have a better solution to my problem!

Thanks for any advice offered.

make a table of 27 fields ;) , use the MD5() function 
http://dev.mysql.com/doc/mysql/en/encryption-functions.html , simply add a 
unique index on the column that holds the md5 sum.

Regards, Francesco

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: LOAD DATA INFILE using 4.0.17

2005-02-15 Thread Robert Dunlop
So what you meant was every field in each row must be unique from all other
instances in all other rows?  Not just each row must be unique?

Bob
- Original Message - 
From: shaun thornburgh [EMAIL PROTECTED]
To: mysql@lists.mysql.com
Sent: Tuesday, February 15, 2005 2:40 PM
Subject: Re: LOAD DATA INFILE using 4.0.17


 Hi,

 Thanks for your reply, but the problem I am facing is that there may be
 duplicate values in the uploaded file and I dont want these to appear in
my
 table...

 From: Bastian Balthazar Bux [EMAIL PROTECTED]
 To: shaun thornburgh [EMAIL PROTECTED]
 Subject: Re: LOAD DATA INFILE using 4.0.17
 Date: Tue, 15 Feb 2005 23:32:56 +0100
 
 shaun thornburgh ha scritto:
 
 Hi,
 
 I have a table with 26 fields, each row in this table must be unique. I
 can't define all of the fields to be primary keys as the limit is 16.
 Therefore before I insert data I have to check that each row is unique.
 Data is to be inserted into this table from files uploaded to the
server -
 CSV files. Is it possible to use the LOAD DATA INFILE statement to check
 for unique rows, or does anyone have a better solution to my problem!
 
 Thanks for any advice offered.
 
 
 
 make a table of 27 fields ;) , use the MD5() function
 http://dev.mysql.com/doc/mysql/en/encryption-functions.html , simply add
a
 unique index on the column that holds the md5 sum.
 
 Regards, Francesco



 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: LOAD DATA INFILE using 4.0.17

2005-02-15 Thread Bastian Balthazar Bux
Sorry for the private answer hitted the wrong replay button.
It's possible for you unload data with an SQL like this ?
SELECT list, of, fields,
   MD5 ( CONCAT ( list, of, fields ) )
INTO OUTFILE 'file_name'
FROM tab
if not (probably, you have csv files), you must use a shell script like 
this (maybe slow)

#! /bin/sh
export SEP=;
while read  myline ; do
   echo ${myline}${SEP}$(echo ${myline} | md5sum  | cut 
--characters=-32)
done  /etc/fstab

in M$ windows you must find an alternative.
BIG WARNINGs!
- This solution implies that forever you will be sticked to the same 
method / program
- the md5 produced from the shell script will be different from the one 
produced from the database

shaun thornburgh ha scritto:
Hi,

Thanks for your reply, but the problem I am facing is that there may 
be duplicate values in the uploaded file and I dont want these to appear 
in my table...

 From: Bastian Balthazar Bux [EMAIL PROTECTED]
 To: shaun thornburgh [EMAIL PROTECTED]
 Subject: Re: LOAD DATA INFILE using 4.0.17
 Date: Tue, 15 Feb 2005 23:32:56 +0100

 shaun thornburgh ha scritto:

 Hi,

 I have a table with 26 fields, each row in this table must be 
unique. I can't define all of the fields to be primary keys as the limit 
is 16. Therefore before I insert data I have to check that each row is 
unique. Data is to be inserted into this table from files uploaded to 
the server - CSV files. Is it possible to use the LOAD DATA INFILE 
statement to check for unique rows, or does anyone have a better 
solution to my problem!

 Thanks for any advice offered.



 make a table of 27 fields ;) , use the MD5() function 
http://dev.mysql.com/doc/mysql/en/encryption-functions.html , simply add 
a unique index on the column that holds the md5 sum.

 Regards, Francesco



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: LOAD DATA INFILE using 4.0.17

2005-02-15 Thread shaun thornburgh
No just every row needs to be unique. Sorry for the confusion...

From: Robert Dunlop [EMAIL PROTECTED]
To: shaun thornburgh 
[EMAIL PROTECTED],mysql@lists.mysql.com
Subject: Re: LOAD DATA INFILE using 4.0.17
Date: Tue, 15 Feb 2005 15:06:19 -0800

So what you meant was every field in each row must be unique from all other
instances in all other rows?  Not just each row must be unique?
Bob
- Original Message -
From: shaun thornburgh [EMAIL PROTECTED]
To: mysql@lists.mysql.com
Sent: Tuesday, February 15, 2005 2:40 PM
Subject: Re: LOAD DATA INFILE using 4.0.17
 Hi,

 Thanks for your reply, but the problem I am facing is that there may be
 duplicate values in the uploaded file and I dont want these to appear in
my
 table...

 From: Bastian Balthazar Bux [EMAIL PROTECTED]
 To: shaun thornburgh [EMAIL PROTECTED]
 Subject: Re: LOAD DATA INFILE using 4.0.17
 Date: Tue, 15 Feb 2005 23:32:56 +0100
 
 shaun thornburgh ha scritto:
 
 Hi,
 
 I have a table with 26 fields, each row in this table must be unique. 
I
 can't define all of the fields to be primary keys as the limit is 16.
 Therefore before I insert data I have to check that each row is 
unique.
 Data is to be inserted into this table from files uploaded to the
server -
 CSV files. Is it possible to use the LOAD DATA INFILE statement to 
check
 for unique rows, or does anyone have a better solution to my problem!
 
 Thanks for any advice offered.
 
 
 
 make a table of 27 fields ;) , use the MD5() function
 http://dev.mysql.com/doc/mysql/en/encryption-functions.html , simply 
add
a
 unique index on the column that holds the md5 sum.
 
 Regards, Francesco



 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]