This is strange. cc the dev list since it might be a bug.
On Thu, Apr 16, 2015 at 3:18 PM, Cesar Flores <ces...@gmail.com> wrote:

> Never mind. I found the solution:
>
>     val newDataFrame = hc.createDataFrame(hiveLoadedDataFrame.rdd,
>       hiveLoadedDataFrame.schema)
>
> which converts the data frame to an RDD and back to a data frame again.
> Not the prettiest solution, but at least it solves my problem.
>
>
> Thanks,
> Cesar Flores
>
>
> On Thu, Apr 16, 2015 at 11:17 AM, Cesar Flores <ces...@gmail.com> wrote:
>
>> I have a data frame into which I load data from a Hive table, and my
>> issue is that the data frame is missing the columns that I need to query.
>>
>> For example:
>>
>>     val newdataset = dataset.where(dataset("label") === 1)
>>
>> gives me an error like the following:
>>
>> ERROR yarn.ApplicationMaster: User class threw exception: resolved
>> attributes label missing from label, user_id, ... (the rest of the
>> fields of my table)
>> org.apache.spark.sql.AnalysisException: resolved attributes label
>> missing from label, user_id, ... (the rest of the fields of my table)
>>
>> where we can see that the label field actually exists. I managed to
>> solve this issue by updating my syntax to:
>>
>>     val newdataset = dataset.where($"label" === 1)
>>
>> which works. However, I cannot use this trick in all my queries. For
>> example, when I try to do a unionAll of two subsets of the same data
>> frame, the error I get is that all my fields are missing.
>>
>> Can someone tell me if I need to do some post-processing after loading
>> from Hive in order to avoid this kind of error?
>>
>>
>> Thanks
>> --
>> Cesar Flores
>
>
> --
> Cesar Flores
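For anyone hitting the same error, the round-trip workaround from the thread can be sketched as below. This is a minimal sketch, not a definitive fix: it assumes a Spark 1.x HiveContext named `hc`, and the table name `my_table` is a hypothetical stand-in, not something from the original post.

```scala
import org.apache.spark.sql.hive.HiveContext

// Assumes `hc` is an existing HiveContext and "my_table" is a
// hypothetical Hive table used only for illustration.
val hiveLoadedDataFrame = hc.table("my_table")

// Re-creating the DataFrame from its RDD plus its schema assigns fresh
// attribute IDs, which works around the "resolved attributes ... missing"
// AnalysisException described above.
val newDataFrame = hc.createDataFrame(hiveLoadedDataFrame.rdd,
  hiveLoadedDataFrame.schema)

// After the round trip, column references resolve as expected, e.g.:
val filtered = newDataFrame.where(newDataFrame("label") === 1)
```

The cost of this trick is an extra conversion through the RDD API, but it avoids having to rewrite every query with the `$"col"` syntax.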