Aldo Bucchi wrote:
Hi,
I have a table that has random spaces appended to random varchar
values. ( that's just a FOL ).
I would like to run a procedure against the table and trim each
varchar value "in place" so that, after running the procedure on a
table, I could be assured that all cells are correctly trimmed.
i.e.
trim_all_cells_in_table( 'DB.DBA.MY_TABLE' );
What's the correct way to do this?
Iterate over the cursor and write in-place?
There may be other ways, but you could start from this:
create procedure trim_stuff(in tbl varchar) {
declare sql varchar;
for (select distinct COLUMN_NAME as cn from db.information_schema.columns
where table_name=tbl and data_type='VARCHAR') do {
sql:=sprintf('update %s set %s=trim(%s);', tbl, cn);
exec(sql, ......); -- see docs for exec() function
};
};
Some of these tables have 10s of millions of rows so performance is
definitely an issue.
I have no idea how (in)efficient the above will be, so will leave better
suggestions to others to contribute; but one tweak to consider is that you
could add the clause
... where %s like '% '
to make it only update & trim values that need it, but like is itself
expensive so that would depend on the *proportion* of offending rows.
Thinking about it some more, consider the nature of the columns in question:
*if* you know a field is only going to have a finite number of distinct values
when normalized, then you could use a lookup table for the permitted values,
as in a star-schema. That should be easy to construct and replace once, then
with foreign-key relationships you'd never have the problem again and integer
comparisons would be way fast. That may or may not be appropriate, however.
HTH,
~Tim
--
Tim Haynes
Product Development Consultant
OpenLink Software
<http://www.openlinksw.com/>
<http://twitter.com/openlink>