Aldo Bucchi wrote:
Hi,

I have a table that has random spaces appended to random varchar
values. ( that's just a FOL ).
I would like to run a procedure against the table and trim each
varchar value "in place" so that, after running the procedure on a
table, I could be assured that all cells are correctly trimmed.

i.e.
trim_all_cells_in_table( 'DB.DBA.MY_TABLE' );

What's the correct way to do this?
Iterate over the cursor and write in-place?

There may be other ways, but you could start from this:

create procedure trim_stuff(in tbl varchar) {
  declare sql varchar;
for (select distinct COLUMN_NAME as cn from db.information_schema.columns
        where table_name=tbl and data_type='VARCHAR') do {
    sql:=sprintf('update %s set %s=trim(%s);', tbl, cn);
    exec(sql, ......);    -- see docs for exec() function
  };
};

Some of these tables have 10s of millions of rows so performance is
definitely an issue.

I have no idea how (in)efficient the above will be, so will leave better suggestions to others to contribute; but one tweak to consider is that you could add the clause
   ... where %s like '% '
to make it only update & trim values that need it, but like is itself expensive so that would depend on the *proportion* of offending rows.

Thinking about it some more, consider the nature of the columns in question: *if* you know a field is only going to have a finite number of distinct values when normalized, then you could use a lookup table for the permitted values, as in a star-schema. That should be easy to construct and replace once, then with foreign-key relationships you'd never have the problem again and integer comparisons would be way fast. That may or may not be appropriate, however.

HTH,

~Tim
--
Tim Haynes
Product Development Consultant
OpenLink Software
<http://www.openlinksw.com/>
<http://twitter.com/openlink>

Reply via email to