[jira] [Updated] (SPARK-6747) Throw an AnalysisException when unsupported Java list types used in Hive UDF
[ https://issues.apache.org/jira/browse/SPARK-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takeshi Yamamuro updated SPARK-6747:

Description:

The current implementation cannot handle List as a return type in a Hive UDF and throws an uninformative scala.MatchError.

Assume the UDF below:

    import java.util.Arrays;
    import java.util.List;
    import org.apache.hadoop.hive.ql.exec.UDF;

    public class UDFToListString extends UDF {
      public List<String> evaluate(Object o) {
        return Arrays.asList("xxx", "yyy", "zzz");
      }
    }

When this UDF is used, a scala.MatchError is thrown as follows:

    scala.MatchError: interface java.util.List (of class java.lang.Class)
      at org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
      at org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
      at org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
      at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
      at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
      at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
      at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
      at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
      at scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
      ...

To make the failure easier for UDF developers to understand, we need to throw a more suitable exception.

was:

The current implementation cannot handle List as a return type in a Hive UDF.

Assume the UDF below:

    import java.util.Arrays;
    import java.util.List;
    import org.apache.hadoop.hive.ql.exec.UDF;

    public class UDFToListString extends UDF {
      public List<String> evaluate(Object o) {
        return Arrays.asList("xxx", "yyy", "zzz");
      }
    }

When this UDF is used, a scala.MatchError is thrown as follows:

    scala.MatchError: interface java.util.List (of class java.lang.Class)
      at org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
      at org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
      at org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
      at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
      at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
      at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
      at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
      at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
      at scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
      ...

To fix this problem, we need to add an entry for List in HiveInspectors#javaClassToDataType. However, this is difficult because of type erasure on the JVM. Assume the lines below are appended to HiveInspectors#javaClassToDataType:

    // list type
    case c: Class[_] if c == classOf[java.util.List[java.lang.Object]] =>
      val tpe = c.getGenericInterfaces()(0).asInstanceOf[ParameterizedType]
      println(tpe.getActualTypeArguments()(0).toString()) // => 'E'

This logic fails to recover the component type of the List: it prints the type variable 'E' rather than the actual element type.

Throw an AnalysisException when unsupported Java list types used in Hive UDF

    Key: SPARK-6747
    URL: https://issues.apache.org/jira/browse/SPARK-6747
    Project: Spark
    Issue Type: Bug
    Components: SQL
    Affects Versions: 1.4.0
    Reporter: Takeshi Yamamuro
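As a rough illustration of the behavior this issue asks for, the sketch below shows one way a class-to-type lookup could reject java.util.List with a descriptive message instead of falling through to a match failure. It is not Spark's code: UdfReturnTypeSketch, javaClassToTypeName, UnsupportedUdfTypeException, and the type names it returns are all hypothetical, and the exception class only stands in for AnalysisException so the example runs without Spark on the classpath.

```java
import java.util.List;

final class UdfReturnTypeSketch {
    /** Hypothetical stand-in for Spark's AnalysisException, so the sketch runs without Spark. */
    static class UnsupportedUdfTypeException extends RuntimeException {
        UnsupportedUdfTypeException(String message) {
            super(message);
        }
    }

    /** Maps a UDF return class to an illustrative type name (names are assumptions of this sketch). */
    static String javaClassToTypeName(Class<?> c) {
        if (c == String.class) return "string";
        if (c == Integer.class || c == int.class) return "int";
        if (c == Long.class || c == long.class) return "bigint";
        if (c == Double.class || c == double.class) return "double";
        if (c == Boolean.class || c == boolean.class) return "boolean";
        if (List.class.isAssignableFrom(c)) {
            // A bare Class object carries no element type, so fail with an explanation.
            throw new UnsupportedUdfTypeException(
                "Unsupported UDF return type " + c.getName()
                    + ": the element type of a Java list cannot be read from its Class");
        }
        // Explicit fallback so no input ends in an unexplained match failure.
        throw new UnsupportedUdfTypeException("Unsupported UDF return type " + c.getName());
    }

    public static void main(String[] args) {
        System.out.println(javaClassToTypeName(String.class));
        try {
            javaClassToTypeName(List.class);
        } catch (UnsupportedUdfTypeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the design is only that every branch ends in either a known type or an exception whose message names the offending class, so a UDF developer can see which return type caused the failure.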
[jira] [Updated] (SPARK-6747) Throw an AnalysisException when unsupported Java list types used in Hive UDF
[ https://issues.apache.org/jira/browse/SPARK-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust updated SPARK-6747:

    Shepherd: Michael Armbrust
    Assignee: Takeshi Yamamuro

Throw an AnalysisException when unsupported Java list types used in Hive UDF

    Key: SPARK-6747
    URL: https://issues.apache.org/jira/browse/SPARK-6747
    Project: Spark
    Issue Type: Bug
    Components: SQL
    Affects Versions: 1.4.0
    Reporter: Takeshi Yamamuro
    Assignee: Takeshi Yamamuro

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-6747) Throw an AnalysisException when unsupported Java list types used in Hive UDF
[ https://issues.apache.org/jira/browse/SPARK-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takeshi Yamamuro updated SPARK-6747:

    Summary: Throw an AnalysisException when unsupported Java list types used in Hive UDF (was: Support List as a return type in Hive UDF)

Throw an AnalysisException when unsupported Java list types used in Hive UDF

    Key: SPARK-6747
    URL: https://issues.apache.org/jira/browse/SPARK-6747
    Project: Spark
    Issue Type: Bug
    Components: SQL
    Affects Versions: 1.4.0
    Reporter: Takeshi Yamamuro
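The type-erasure difficulty noted in the earlier description of this issue can be reproduced outside Spark. The sketch below is only an illustration under stated assumptions: GenericReturnTypeSketch and its helper methods are hypothetical names, and the nested UDFToListString deliberately does not extend Hive's UDF so that the example runs with the JDK alone. It contrasts inspecting java.util.List itself, which yields only the type variable, with inspecting the generic return type of the evaluate method, where the declared element type is still recorded in the method signature.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.List;

final class GenericReturnTypeSketch {
    /** Stand-in for the UDF in the issue; it does not extend Hive's UDF so no Hive jar is needed. */
    static class UDFToListString {
        public List<String> evaluate(Object o) {
            return Arrays.asList("xxx", "yyy", "zzz");
        }
    }

    /** Mirrors the snippet from the issue: looks at the declaration of java.util.List itself. */
    static String typeArgumentFromListClass() {
        ParameterizedType tpe = (ParameterizedType) List.class.getGenericInterfaces()[0];
        return tpe.getActualTypeArguments()[0].toString();
    }

    /** Reads the element type from a method's declared generic return type, if it has one. */
    static String elementTypeFromMethod(Method m) {
        Type returnType = m.getGenericReturnType();
        if (returnType instanceof ParameterizedType) {
            Type arg = ((ParameterizedType) returnType).getActualTypeArguments()[0];
            return arg.getTypeName();
        }
        // A raw return type such as plain List carries no element type.
        return "unknown";
    }

    public static void main(String[] args) throws Exception {
        Method evaluate = UDFToListString.class.getMethod("evaluate", Object.class);
        System.out.println(typeArgumentFromListClass());     // prints "E"
        System.out.println(elementTypeFromMethod(evaluate)); // prints "java.lang.String"
    }
}
```

Whether this approach fits HiveInspectors is a separate question; the sketch only shows that a Class object alone is not enough, while a Method object can carry the element type.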