Thejas M Nair created HIVE-6140:
-----------------------------------

             Summary: trim udf is very slow
                 Key: HIVE-6140
                 URL: https://issues.apache.org/jira/browse/HIVE-6140
             Project: Hive
          Issue Type: Bug
          Components: UDF
            Reporter: Thejas M Nair

Paraphrasing what was reported by [~cartershanklin] -

I used the attached Perl script to generate 500 million two-character strings which always included a space. I loaded it using:

    create table letters (l string);
    load data local inpath '/home/sandbox/data.csv' overwrite into table letters;

Then I ran this SQL script:

    select count(l) from letters where l = 'l ';
    select count(l) from letters where trim(l) = 'l';

First query = 170 seconds
Second query = 514 seconds

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)