[ https://issues.apache.org/jira/browse/HIVE-6140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862177#comment-13862177 ]
Anandha L Ranganathan commented on HIVE-6140:
---------------------------------------------

[~thejas]/[~cartershanklin]
Could you provide the data.csv file that caused the problem? Otherwise, please provide an example of the data.

> trim udf is very slow
> ---------------------
>
>         Key: HIVE-6140
>         URL: https://issues.apache.org/jira/browse/HIVE-6140
>     Project: Hive
>  Issue Type: Bug
>  Components: UDF
>    Reporter: Thejas M Nair
>    Assignee: Anandha L Ranganathan
>
> Paraphrasing what was reported by [~cartershanklin] -
> I used the attached Perl script to generate 500 million two-character strings
> which always included a space. I loaded it using:
> create table letters (l string);
> load data local inpath '/home/sandbox/data.csv' overwrite into table letters;
> Then I ran this SQL script:
> select count(l) from letters where l = 'l ';
> select count(l) from letters where trim(l) = 'l';
> First query = 170 seconds
> Second query = 514 seconds

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
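
For anyone trying to reproduce the report without the attached Perl script, here is a rough Python sketch of a generator for similar data. It is not the original script. The function names, the lowercase-only alphabet, and the even split between a leading and a trailing space are all assumptions; the report only says that each string has two characters and always includes a space.

```python
import random
import string


def generate_rows(n, seed=0):
    """Yield n two-character strings, each holding one lowercase letter and one space.

    Assumption: the real data.csv may use a different alphabet or space placement.
    """
    rng = random.Random(seed)
    for _ in range(n):
        letter = rng.choice(string.ascii_lowercase)
        # Place the space after the letter ("l ") or before it (" l").
        if rng.random() < 0.5:
            yield letter + " "
        else:
            yield " " + letter


def write_sample(path, n, seed=0):
    """Write n generated rows to path, one per line."""
    with open(path, "w", newline="\n") as f:
        for row in generate_rows(n, seed):
            f.write(row + "\n")
```

Calling `write_sample("data.csv", 1000)` produces a small file that can be loaded with the `load data local inpath` statement quoted above. The report used 500 million rows, so a much larger `n` would be needed to see comparable timings.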