Sorry, it's 1.1.0. After digging a bit more into this, it seems that the OpenCSV deserializer converts all the columns to the String type. This may be throwing the execution off. I am planning to create a class and map the rows to this custom class. I will keep this thread updated.
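A rough sketch of the idea, in plain Scala collections rather than Spark so that it runs on its own. The names SourceRow, CastBeforeSum, toSourceRow and sumByKey are made up for illustration, and the column types (two String keys and an Int value) are assumptions; only c1, c2 and c3 come from the query quoted below.

```scala
// Hypothetical typed row. The column names c1, c2, c3 come from the query
// in this thread; the types are assumptions.
case class SourceRow(c1: String, c2: String, c3: Int)

object CastBeforeSum {
  // Convert one all-String record (the shape the deserializer appears to
  // produce) into a typed row, parsing the numeric column explicitly.
  def toSourceRow(fields: Seq[String]): SourceRow =
    SourceRow(fields(0), fields(1), fields(2).trim.toInt)

  // Plain-collections equivalent of:
  //   select c1, c2, sum(c3) ... group by c1, c2
  def sumByKey(rows: Seq[SourceRow]): Map[(String, String), Int] =
    rows.groupBy(r => (r.c1, r.c2)).map { case (key, group) =>
      key -> group.map(_.c3).sum
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample records, every field a String.
    val raw = Seq(
      Seq("a", "x", "1"),
      Seq("a", "x", "2"),
      Seq("b", "y", "5")
    )
    val totals = sumByKey(raw.map(toSourceRow))
    totals.foreach { case ((c1, c2), total) => println(s"$c1, $c2, $total") }
  }
}
```

In Spark the same conversion would presumably be applied with a map over the source RDD before registering the result as a table, so that the SUM sees an Int column instead of a String one. I have not verified that against 1.1.0 yet.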
On Wed, Oct 8, 2014 at 5:11 PM, Michael Armbrust <mich...@databricks.com> wrote:

> Which version of Spark are you running?
>
> On Wed, Oct 8, 2014 at 4:18 PM, Ranga <sra...@gmail.com> wrote:
>
>> Thanks Michael. Should the cast be done in the source RDD or while doing
>> the SUM?
>> To give a better picture, here is the code sequence:
>>
>> val sourceRdd = sql("select ... from source-hive-table")
>> sourceRdd.registerAsTable("sourceRDD")
>> val aggRdd = sql("select c1, c2, sum(c3) from sourceRDD group by c1, c2")
>> // This query throws the exception when I collect the results
>>
>> I tried adding the cast to the aggRdd query above and that didn't help.
>>
>> - Ranga
>>
>> On Wed, Oct 8, 2014 at 3:52 PM, Michael Armbrust <mich...@databricks.com>
>> wrote:
>>
>>> Using SUM on a string should automatically cast the column. Also you
>>> can use CAST to change the datatype
>>> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-TypeConversionFunctions>.
>>>
>>> What version of Spark are you running? This could be
>>> https://issues.apache.org/jira/browse/SPARK-1994
>>>
>>> On Wed, Oct 8, 2014 at 3:47 PM, Ranga <sra...@gmail.com> wrote:
>>>
>>>> Hi
>>>>
>>>> I am in the process of migrating some logic in pig scripts to
>>>> Spark-SQL. As part of this process, I am creating a few "Select...Group By"
>>>> queries and registering them as tables using the SchemaRDD.registerAsTable
>>>> feature.
>>>> When using such a registered table in a subsequent "Select...Group By"
>>>> query, I get a "ClassCastException".
>>>> java.lang.ClassCastException: java.lang.String cannot be cast to
>>>> java.lang.Integer
>>>>
>>>> This happens when I use the "Sum" function on one of the columns. Is
>>>> there any way to specify the data type for the columns when the
>>>> registerAsTable function is called? Are there other approaches that I
>>>> should be looking at?
>>>>
>>>> Thanks for your help.
>>>>
>>>> - Ranga