[ https://issues.apache.org/jira/browse/HIVE-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mitja Trampus updated HIVE-2620:
--------------------------------

    Description:
Whenever you have a LIKE expression that contains "|+" (the culprit) and "%" (so it gets converted to regex), hive throws an exception that crashes the whole job.

Possibly related: https://issues.apache.org/jira/browse/HIVE-2594

{noformat}
hive> select 'foo |+18| bar' like 'foo |+18% bar' from akramer_one_row;
FAILED: Error in semantic analysis: Line 1:7 Wrong arguments ''foo |+18% bar'': org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.BooleanWritable org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(org.apache.hadoop.io.Text,org.apache.hadoop.io.Text) on object org.apache.hadoop.hive.ql.udf.UDFLike@292e2fba of class org.apache.hadoop.hive.ql.udf.UDFLike with arguments {foo |+18| bar:org.apache.hadoop.io.Text, foo |+% bar:org.apache.hadoop.io.Text} of size 2
{noformat}

Stack trace from the real world example with which I found this:

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.BooleanWritable org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(org.apache.hadoop.io.Text,org.apache.hadoop.io.Text) on object org.apache.hadoop.hive.ql.udf.UDFLike@4a7baf7d of class org.apache.hadoop.hive.ql.udf.UDFLike with arguments {ewt.arkadaslar pazartesinden sonra ozel escortlar sayfamızı zıyaret etcek lutfn kaba dawranmıyalım escortlarımız resmlı olcak sız begenıceksınız escortunuzu escortlarımı ıl ıl olacktır bılgnıze:org.apache.hadoop.io.Text, %çıtıR%kızLar%escort%kızLarı%burda%|+%18%|%:org.apache.hadoop.io.Text} of size 2
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:836)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:180)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:163)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:575)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:767)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:722)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:129)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:765)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
	... 5 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:812)
	... 19 more
Caused by: java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 42
.*çıtıR.*kızLar.*escort.*kızLarı.*burda.*|+.*18.*|.*
                                          ^
	at java.util.regex.Pattern.error(Pattern.java:1713)
	at java.util.regex.Pattern.sequence(Pattern.java:1878)
	at java.util.regex.Pattern.expr(Pattern.java:1752)
	at java.util.regex.Pattern.compile(Pattern.java:1460)
	at java.util.regex.Pattern.<init>(Pattern.java:1133)
	at java.util.regex.Pattern.compile(Pattern.java:823)
	at org.apache.hadoop.hive.ql.udf.UDFLike.evaluate(UDFLike.java:186)
	... 23 more
{noformat}

> LIKE incorrectly transforms expression to regex (does not escape "+" and possibly other special chars)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-2620
>                 URL: https://issues.apache.org/jira/browse/HIVE-2620
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>            Reporter: Mitja Trampus
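
The failure reported above comes from LIKE literals such as "|+" reaching java.util.regex.Pattern.compile unescaped. A minimal sketch of one way to avoid that is shown here: quote every literal run with Pattern.quote and translate only the LIKE wildcards. The class LikeToRegex and its method likeToRegex are hypothetical names used for illustration; this is not the code in UDFLike and not necessarily how the issue should be fixed in Hive. Backslash-escaped wildcards are left out to keep the sketch short.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not Hive's UDFLike.
class LikeToRegex {

    // Translate a SQL LIKE pattern into a java.util.regex pattern string.
    // '%' becomes ".*", '_' becomes ".", and every other run of characters
    // is wrapped with Pattern.quote so that '+', '|', '(' and similar
    // characters are matched literally.
    // Assumption: escaped wildcards (e.g. "\%") are not handled here.
    static String likeToRegex(String like) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < like.length(); i++) {
            char c = like.charAt(i);
            if (c == '%' || c == '_') {
                flushLiteral(literal, regex);
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flushLiteral(literal, regex);
        return regex.toString();
    }

    // Append the pending literal run, quoted, and reset the buffer.
    private static void flushLiteral(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    public static void main(String[] args) {
        // The pattern from the report's first example.
        String regex = likeToRegex("foo |+18% bar");
        System.out.println(regex); // prints \Qfoo |+18\E.*\Q bar\E
        // DOTALL lets '%' and '_' also span line terminators.
        boolean matches = Pattern.compile(regex, Pattern.DOTALL)
                .matcher("foo |+18| bar")
                .matches();
        System.out.println(matches); // prints true
    }
}
```

With this translation, the pattern from the stack trace, %çıtıR%kızLar%escort%kızLarı%burda%|+%18%|%, compiles without a PatternSyntaxException, because "|+" and "|" are emitted as quoted literals rather than as regex operators.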