On 03/26/2013 06:08 AM, James Sharrett wrote:
I'm trying remove all instances of non-alphanumeric or underscore
characters from a query result for further use. This is part of a
function I'm writing that is in plpgsql
Examples:
Original value
'My text1'
'My text 2'
'My-text-3'
'My_text4'
'My!text5'
Desired
'Mytext1'
'Mytext2'
'Mytext3'
'My_text4' (no change)
'Mytext5'
The field containing the text is column_name. I tried the following:
Select regexp_replace(column_name,'\W','') from mytable
This deals with the correct characters but only does the first
instance of the character so the output is:
'My text1'
'Mytext 2' (wrong)
'Mytext-3' (wrong)
'My_text4'
'My!text5'
I managed to get the desired output by writing the text into a
variable through a loop and then just keep looping on the variable
until all the characters are removed:
sql_qry:= 'select column_name from mytable';
for sql_record in execute sql_qry loop
curr_record := sql_record.column_name;
while length(substring(curr_record from '\W'))>0 loop
curr_record := regexp_replace(curr_record, '\W','');
end loop;
.... rest of the code
This works but it seems like a lot of work to do something this simple
but I cannot find any function that will replace all instances of a
string AND can base it on a regular expression pattern. Is there a
better way to do this in 9.1?
You were on the right track with regexp_replace but you need to add a
global flag:
regexp_replace(column_name,'\W','','g')
See examples under
http://www.postgresql.org/docs/9.1/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
Cheers,
Steve