[jira] [Updated] (SPARK-6747) Throw an AnalysisException when unsupported Java list types used in Hive UDF

2015-07-06 Thread Takeshi Yamamuro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-6747:

Description: 
The current implementation can't handle List as a return type in Hive UDF and
throws meaningless Match Error.
We assume an UDF below;

public class UDFToListString extends UDF {
public ListString evaluate(Object o) {
return Arrays.asList(xxx, yyy, zzz);
}
}

An exception of scala.MatchError is thrown as follows when the UDF used;

scala.MatchError: interface java.util.List (of class java.lang.Class)
at 
org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
at 
org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
at 
org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
at 
org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
at 
org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
at 
org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
at 
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at 
scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
...

To make udf developers more understood, we need to throw a more suitable 
exception.

  was:
The current implementation can't handle List as a return type in Hive UDF.
We assume an UDF below;

public class UDFToListString extends UDF {
public ListString evaluate(Object o) {
return Arrays.asList(xxx, yyy, zzz);
}
}

An exception of scala.MatchError is thrown as follows when the UDF used;

scala.MatchError: interface java.util.List (of class java.lang.Class)
at 
org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
at 
org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
at 
org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
at 
org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
at 
org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
at 
org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
at 
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at 
scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
...

To fix this problem, we need to add an entry for List in 
HiveInspectors#javaClassToDataType.
However, it has one difficulty because of type erasure in JVM.
We assume that lines below are appended in HiveInspectors#javaClassToDataType;

// list type
case c: Class[_] if c == classOf[java.util.List[java.lang.Object]] =
val tpe = c.getGenericInterfaces()(0).asInstanceOf[ParameterizedType]
println(tpe.getActualTypeArguments()(0).toString()) = 'E'

This logic fails to catch a component type in List.


 Throw an AnalysisException when unsupported Java list types used in Hive UDF
 

 Key: SPARK-6747
 URL: https://issues.apache.org/jira/browse/SPARK-6747
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.4.0
Reporter: Takeshi Yamamuro

 The current implementation can't handle List as a return type in Hive UDF 
 and
 throws meaningless Match Error.
 We assume an UDF below;
 public class UDFToListString extends UDF {
 public ListString evaluate(Object o) {
 return Arrays.asList(xxx, yyy, zzz);
 }
 }
 An exception of scala.MatchError is thrown as follows when the UDF used;
 scala.MatchError: interface java.util.List (of class java.lang.Class)
   at 
 org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
   at 
 org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
   at 
 org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
   at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
   at 
 org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
   at 
 org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
   at 
 

[jira] [Updated] (SPARK-6747) Throw an AnalysisException when unsupported Java list types used in Hive UDF

2015-07-06 Thread Michael Armbrust (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Armbrust updated SPARK-6747:

Shepherd: Michael Armbrust
Assignee: Takeshi Yamamuro

 Throw an AnalysisException when unsupported Java list types used in Hive UDF
 

 Key: SPARK-6747
 URL: https://issues.apache.org/jira/browse/SPARK-6747
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.4.0
Reporter: Takeshi Yamamuro
Assignee: Takeshi Yamamuro

 The current implementation can't handle List as a return type in Hive UDF 
 and
 throws meaningless Match Error.
 We assume an UDF below;
 public class UDFToListString extends UDF {
 public ListString evaluate(Object o) {
 return Arrays.asList(xxx, yyy, zzz);
 }
 }
 An exception of scala.MatchError is thrown as follows when the UDF used;
 scala.MatchError: interface java.util.List (of class java.lang.Class)
   at 
 org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
   at 
 org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
   at 
 org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
   at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
   at 
 org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
   at 
 org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
   at 
 org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
   at 
 scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
   at 
 scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
 ...
 To make udf developers more understood, we need to throw a more suitable 
 exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-6747) Throw an AnalysisException when unsupported Java list types used in Hive UDF

2015-07-06 Thread Takeshi Yamamuro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-6747:

Summary: Throw an AnalysisException when unsupported Java list types used 
in Hive UDF  (was: Support List as a return type in Hive UDF)

 Throw an AnalysisException when unsupported Java list types used in Hive UDF
 

 Key: SPARK-6747
 URL: https://issues.apache.org/jira/browse/SPARK-6747
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.4.0
Reporter: Takeshi Yamamuro

 The current implementation can't handle List as a return type in Hive UDF.
 We assume an UDF below;
 public class UDFToListString extends UDF {
 public ListString evaluate(Object o) {
 return Arrays.asList(xxx, yyy, zzz);
 }
 }
 An exception of scala.MatchError is thrown as follows when the UDF used;
 scala.MatchError: interface java.util.List (of class java.lang.Class)
   at 
 org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
   at 
 org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
   at 
 org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
   at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
   at 
 org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
   at 
 org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
   at 
 org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
   at 
 scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
   at 
 scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
 ...
 To fix this problem, we need to add an entry for List in 
 HiveInspectors#javaClassToDataType.
 However, it has one difficulty because of type erasure in JVM.
 We assume that lines below are appended in HiveInspectors#javaClassToDataType;
 // list type
 case c: Class[_] if c == classOf[java.util.List[java.lang.Object]] =
 val tpe = c.getGenericInterfaces()(0).asInstanceOf[ParameterizedType]
 println(tpe.getActualTypeArguments()(0).toString()) = 'E'
 This logic fails to catch a component type in List.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org