Hi list subscribers,

i am inserting millions of product rows from csv files via LOAD DATA INFILE. 
Every product has a price, but this "price-strings" vary heavily.
The main difference between them, is the decimal format. Sometimes it is 
european like this: 
1.000.000,00
sometimes its american like this:
1,000,000.00

so some examples for these strings are:
EUR 1,00                 (meaning decimal 1.00)
1.00 €             (meaning decimal 1.00)
1.000,00 EUR          (meaning decimal 1000.00)
EUR 1.000,00          (meaning decimal 1000.00)
1.000 EUR               (meaning decimal 1000.00)
1,000 EUR               (meaning decimal 1000.00)

and now, i want to filter/validate that directly within the query.
I think some RegEx could do the trick, but this is to much for me. Ive searched 
for some RegEx for validating decimals, but they allways use just one notation 
of a decimal. I think a good strategy for that, to filter all chars but 
[0-9\.,], and than to do some logic like to check how man chars after the last 
dot or comma (if its two you know its something like 0.00).

No my problem is, i don't know where to start. Maybe with a stored procedure or 
something like that? I don't wanna use another language for this, because i 
would have to do some comprehensive update work then (selecting every row, 
checking the price, updating the price), and these rows get written every day, 
so i would have to do these updates once a day on a couple of million rows.

Anyone get me in the right direction?
Thanks in advance
Regards Benjamin Bittner

                
---------------------------------
Was ist Glück? Schlafen Fische überhaupt? Die Antworten gibt’s auf Yahoo! 
Clever.

Reply via email to