Sorry. It's 1.1.0.
After digging a bit more into this, it seems like the OpenCSV Deserializer
converts all the columns to a String type. This may be throwing the
execution off. I am planning to create a class and map the rows to this
custom class. Will keep this thread updated.
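
A rough sketch of that plan, in plain Scala without Spark so it runs on its
own. The column names c1, c2, c3 come from the query quoted below; the types
(two Strings and an Int) and the names SourceRow, CastSketch, parse and
sumByKey are assumptions made up for illustration, not anything from the
actual table or from Spark.

```scala
// Hypothetical typed row. The real column types are an assumption.
case class SourceRow(c1: String, c2: String, c3: Int)

object CastSketch {
  // Convert one all-String row (as the deserializer hands it over) into a
  // typed row. Rows with fewer than three fields, or with a c3 that is not
  // an integer, are dropped by returning None.
  def parse(fields: Seq[String]): Option[SourceRow] = fields match {
    case Seq(c1, c2, c3, _*) =>
      scala.util.Try(SourceRow(c1, c2, c3.trim.toInt)).toOption
    case _ => None
  }

  // Plain-collections equivalent of
  // "select c1, c2, sum(c3) ... group by c1, c2" over the typed rows.
  def sumByKey(rows: Seq[SourceRow]): Map[(String, String), Int] =
    rows.groupBy(r => (r.c1, r.c2)).map { case (k, rs) => k -> rs.map(_.c3).sum }

  def main(args: Array[String]): Unit = {
    // Sample data invented for the sketch; the last row is malformed on purpose.
    val raw = Seq(
      Seq("a", "x", "2"),
      Seq("a", "x", "3"),
      Seq("b", "y", "7"),
      Seq("b", "y", "oops")
    )
    val typed = raw.flatMap(r => parse(r))
    println(sumByKey(typed))
  }
}
```

The same parse step would presumably go inside a map over the source RDD
before registering the result as a table, so that c3 already has a numeric
type by the time the SUM runs.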

On Wed, Oct 8, 2014 at 5:11 PM, Michael Armbrust <mich...@databricks.com>
wrote:

> Which version of Spark are you running?
>
> On Wed, Oct 8, 2014 at 4:18 PM, Ranga <sra...@gmail.com> wrote:
>
>> Thanks Michael. Should the cast be done in the source RDD or while doing
>> the SUM?
>> To give a better picture here is the code sequence:
>>
>> val sourceRdd = sql("select ... from source-hive-table")
>> sourceRdd.registerAsTable("sourceRDD")
>> val aggRdd = sql("select c1, c2, sum(c3) from sourceRDD group by c1, c2")
>>  // This query throws the exception when I collect the results
>>
>> I tried adding the cast to the aggRdd query above and that didn't help.
>>
>>
>> - Ranga
>>
>> On Wed, Oct 8, 2014 at 3:52 PM, Michael Armbrust <mich...@databricks.com>
>> wrote:
>>
>>> Using SUM on a string should automatically cast the column.  Also you
>>> can use CAST to change the datatype
>>> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-TypeConversionFunctions>
>>> .
>>>
>>> What version of Spark are you running?  This could be
>>> https://issues.apache.org/jira/browse/SPARK-1994
>>>
>>> On Wed, Oct 8, 2014 at 3:47 PM, Ranga <sra...@gmail.com> wrote:
>>>
>>>> Hi
>>>>
>>>> I am in the process of migrating some logic in Pig scripts to
>>>> Spark-SQL. As part of this process, I am creating a few "Select...Group By"
>>>> queries and registering them as tables using the
>>>> SchemaRDD.registerAsTable feature.
>>>> When using such a registered table in a subsequent "Select...Group By"
>>>> query, I get a "ClassCastException".
>>>> java.lang.ClassCastException: java.lang.String cannot be cast to
>>>> java.lang.Integer
>>>>
>>>> This happens when I use the "Sum" function on one of the columns. Is
>>>> there any way to specify the data type for the columns when the
>>>> registerAsTable function is called? Are there other approaches that I
>>>> should be looking at?
>>>>
>>>> Thanks for your help.
>>>>
>>>>
>>>>
>>>> - Ranga
>>>>
>>>
>>>
>>
>
