Hi Cesar,

Can you try 1.3.1 (
https://spark.apache.org/releases/spark-release-1-3-1.html) and see if it
still shows the error?

Thanks,

Yin

On Fri, Apr 17, 2015 at 1:58 PM, Reynold Xin <r...@databricks.com> wrote:

> This is strange. cc the dev list since it might be a bug.
>
>
>
> On Thu, Apr 16, 2015 at 3:18 PM, Cesar Flores <ces...@gmail.com> wrote:
>
>> Never mind. I found the solution:
>>
>> val newDataFrame = hc.createDataFrame(hiveLoadedDataFrame.rdd,
>> hiveLoadedDataFrame.schema)
>>
>> which converts the data frame to an RDD and back again to a data frame.
>> Not the prettiest solution, but at least it solves my problem.
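A minimal sketch of the round-trip workaround above, assuming a Spark 1.3.x `HiveContext` named `hc` and a hypothetical Hive table name `my_table` (illustrative only, not run against a live cluster):

```scala
// Assumes: hc is an org.apache.spark.sql.hive.HiveContext
// and "my_table" is a hypothetical Hive table with a "label" column.
val hiveLoadedDataFrame = hc.table("my_table")

// Rebuild the DataFrame from its underlying RDD plus its schema.
// This produces fresh attribute references, which works around the
// "resolved attributes ... missing" analysis error.
val newDataFrame = hc.createDataFrame(hiveLoadedDataFrame.rdd,
  hiveLoadedDataFrame.schema)

// Column references resolved against the rebuilt DataFrame now work:
val filtered = newDataFrame.where(newDataFrame("label") === 1)
```

The round trip is a no-op on the data itself; only the logical plan's attribute IDs change.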
>>
>>
>> Thanks,
>> Cesar Flores
>>
>>
>>
>> On Thu, Apr 16, 2015 at 11:17 AM, Cesar Flores <ces...@gmail.com> wrote:
>>
>>>
>>> I have a data frame in which I load data from a hive table. My issue
>>> is that the data frame appears to be missing the columns that I need to
>>> query, even though they exist in the table.
>>>
>>> For example:
>>>
>>> val newdataset = dataset.where(dataset("label") === 1)
>>>
>>> gives me an error like the following:
>>>
>>> ERROR yarn.ApplicationMaster: User class threw exception: resolved
>>> attributes label missing from label, user_id, ... (the rest of the fields
>>> of my table)
>>> org.apache.spark.sql.AnalysisException: resolved attributes label
>>> missing from label, user_id, ... (the rest of the fields of my table)
>>>
>>> where we can see that the label field actually exists. I managed to solve
>>> this issue by updating my syntax to:
>>>
>>> val newdataset = dataset.where($"label" === 1)
>>>
>>> which works. However, I cannot use this trick in all my queries. For
>>> example, when I try to do a unionAll of two subsets of the same data
>>> frame, the error says that all my fields are missing.
>>>
>>> Can someone tell me if I need to do some post-processing after loading
>>> from Hive in order to avoid this kind of error?
>>>
>>>
>>> Thanks
>>> --
>>> Cesar Flores
>>>
>>
>>
>>
>> --
>> Cesar Flores
>>
>
>