Ken Ellinwood created SPARK-1591:
------------------------------------

             Summary: scala.MatchError executing custom UDTF
                 Key: SPARK-1591
                 URL: https://issues.apache.org/jira/browse/SPARK-1591
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 0.9.1
         Environment: CentOS 5, Hortonworks 1.3.2, Hadoop 1.2.0, Hive 0.11.0, Spark 0.9.1, Shark 0.9.1, sharkserver2, beeline
            Reporter: Ken Ellinwood
            Priority: Minor


My custom UDTF fails to execute in Shark even though it runs fine in Hive.

scala.MatchError: [orange, 1, Black, 419] (of class java.util.ArrayList)
    at scala.runtime.ScalaRunTime$.array_clone(ScalaRunTime.scala:118)
    at shark.execution.UDTFCollector.collect(UDTFOperator.scala:92)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:91)
    at com.mycompany.warehouse.hive.HiveUdtfColorTreeTable.process(HiveUdtfColorTreeTable.java:98)
    at shark.execution.UDTFOperator.explode(UDTFOperator.scala:79)
    at shark.execution.LateralViewJoinOperator$$anonfun$processPartition$1.apply(LateralViewJoinOperator.scala:141)


The code at UDTFOperator.scala, line 92 makes two assumptions that do not hold 
in my case.  First, it assumes the row object needs to be cloned.  Second, it 
assumes every row object is an array.  In my case the row is a 
java.util.ArrayList and does not need to be cloned, because my UDTF already 
creates a new one for each row.  The clone operation fails because my row is 
not an array.
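
To illustrate the kind of handling I have in mind, here is a minimal sketch of a collector-side helper that accepts either representation.  This is not the actual Shark code; the class name RowNormalizer and the method toRowArray are hypothetical and exist only for this example.  It assumes a forwarded row is either an Object[] or a java.util.List.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of Shark or Hive. */
final class RowNormalizer {

    private RowNormalizer() {}

    /**
     * Returns a fresh Object[] holding the fields of a forwarded row.
     * Arrays are cloned; Lists are copied with toArray(), which always
     * allocates a new array. Any other type is rejected with a clear error.
     */
    static Object[] toRowArray(Object row) {
        if (row instanceof Object[]) {
            // Array-backed row: shallow copy so the UDTF can reuse its buffer.
            return ((Object[]) row).clone();
        }
        if (row instanceof List<?>) {
            // List-backed row (e.g. ArrayList): copy the elements into a new array.
            return ((List<?>) row).toArray();
        }
        String type = (row == null) ? "null" : row.getClass().getName();
        throw new IllegalArgumentException("Unsupported row type: " + type);
    }
}
```

With a helper like this, a row such as [orange, 1, Black, 419] forwarded as an ArrayList would be copied into a four-element array rather than falling through the match.  The copy is shallow in both branches, the same as cloning an array.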

I changed my implementation to use an array, but we have a non-trivial number 
of custom UDFs that all work with Hive and I think they should work in Shark 
without modification.



