Hi Cesar,

Can you try 1.3.1 (https://spark.apache.org/releases/spark-release-1-3-1.html)
and see if it still shows the error?
Thanks,
Yin

On Fri, Apr 17, 2015 at 1:58 PM, Reynold Xin <r...@databricks.com> wrote:

> This is strange. cc'ing the dev list since it might be a bug.
>
> On Thu, Apr 16, 2015 at 3:18 PM, Cesar Flores <ces...@gmail.com> wrote:
>
>> Never mind. I found the solution:
>>
>> val newDataFrame = hc.createDataFrame(hiveLoadedDataFrame.rdd,
>>   hiveLoadedDataFrame.schema)
>>
>> which converts the data frame to an RDD and back again to a data frame.
>> Not the prettiest solution, but at least it solves my problem.
>>
>> Thanks,
>> Cesar Flores
>>
>> On Thu, Apr 16, 2015 at 11:17 AM, Cesar Flores <ces...@gmail.com> wrote:
>>
>>> I have a data frame into which I load data from a Hive table, and my
>>> issue is that the data frame seems to be missing the columns that I need
>>> to query.
>>>
>>> For example:
>>>
>>> val newdataset = dataset.where(dataset("label") === 1)
>>>
>>> gives me an error like the following:
>>>
>>> ERROR yarn.ApplicationMaster: User class threw exception: resolved
>>> attributes label missing from label, user_id, ... (the rest of the
>>> fields of my table)
>>> org.apache.spark.sql.AnalysisException: resolved attributes label
>>> missing from label, user_id, ... (the rest of the fields of my table)
>>>
>>> where we can see that the label field actually exists. I managed to work
>>> around this issue by updating my syntax to:
>>>
>>> val newdataset = dataset.where($"label" === 1)
>>>
>>> which works. However, I cannot use this trick in all my queries. For
>>> example, when I try to do a unionAll of two subsets of the same data
>>> frame, the error I get is that all my fields are missing.
>>>
>>> Can someone tell me if I need to do some post-processing after loading
>>> from Hive in order to avoid this kind of error?
>>>
>>> Thanks
>>> --
>>> Cesar Flores
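[For readers hitting the same error, here is a minimal sketch of the workaround described in the thread. It assumes Spark 1.3.x with an existing SparkContext `sc`, and the table name "my_table" and column "label" are hypothetical placeholders; adjust them for your own environment.]

```scala
import org.apache.spark.sql.hive.HiveContext

// Hypothetical setup: `sc` is an existing SparkContext and "my_table"
// is a Hive table that has a "label" column.
val hc = new HiveContext(sc)
val hiveLoadedDataFrame = hc.table("my_table")

// On affected versions this form can fail with
// "resolved attributes label missing from label, user_id, ...":
// val filtered = hiveLoadedDataFrame.where(hiveLoadedDataFrame("label") === 1)

// Workaround from the thread: rebuild the DataFrame from its RDD and
// schema, which forces the attribute references to be re-resolved.
val newDataFrame = hc.createDataFrame(hiveLoadedDataFrame.rdd,
  hiveLoadedDataFrame.schema)

// Column references against the rebuilt DataFrame now resolve normally.
val filtered = newDataFrame.where(newDataFrame("label") === 1)
```

This sketch cannot be run outside a Spark deployment with Hive support, so treat it as a template rather than a verified program; upgrading to 1.3.1, as suggested above, may make the workaround unnecessary.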