Re: Dealing with missing columns in SPARK SQL in JSON

Aseem Bansal Tue, 14 Feb 2017 07:05:58 -0800

Sorry if I trivialized the example. The files are all of the same kind, but
any given file could have "a", "b", or both, and I don't know in advance
which. That is what I meant by missing columns.


Ideally, I could read any of the JSON files, run the same Spark SQL query
on it, and get

for json1.json

a | b
1 | null

for json2.json

a    | b
null | 2
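
One way to get output of this shape is to generate the SELECT statement per
DataFrame, filling in a typed NULL for every expected column that the
DataFrame lacks. The sketch below is plain Python with no Spark dependency.
The helper name build_select and the EXPECTED mapping are hypothetical, and
the BIGINT type is an assumption about how the integer values are read.

```python
def build_select(existing_columns, expected_columns):
    """Build a statement that selects every expected column, substituting
    a typed NULL for columns the DataFrame does not have.

    existing_columns: column names present in the DataFrame (e.g. df.columns)
    expected_columns: dict mapping column name -> Spark SQL type name
    """
    existing = set(existing_columns)
    parts = []
    for name, sql_type in expected_columns.items():
        if name in existing:
            # Column is present: select it as-is.
            parts.append(name)
        else:
            # Column is missing: emit a NULL of the expected type under that name.
            parts.append("CAST(NULL AS {}) AS {}".format(sql_type, name))
    return "SELECT {} FROM __THIS__".format(", ".join(parts))


# Hypothetical expected columns for the two files in this thread (assumed BIGINT).
EXPECTED = {"a": "BIGINT", "b": "BIGINT"}

print(build_select(["a"], EXPECTED))  # statement for json1DF
print(build_select(["b"], EXPECTED))  # statement for json2DF
```

The resulting string could be used as the statement of a SQLTransformer
created for that particular DataFrame. If the read step is under your
control, another option is to pass an explicit schema to the JSON reader, in
which case fields absent from a file come back as null.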


On Tue, Feb 14, 2017 at 8:13 PM, Sam Elamin <hussam.ela...@gmail.com> wrote:

> I may be missing something super obvious here, but can't you combine them
> into a single dataframe? Left join perhaps?
>
> Try writing it in SQL, "select a from json1 and b from json2", then run
> explain to get a hint on how to do it in code
>
> Regards
> Sam
> On Tue, 14 Feb 2017 at 14:30, Aseem Bansal <asmbans...@gmail.com> wrote:
>
>> Say I have two files, each containing a single row
>>
>> json1.json
>>
>> {"a": 1}
>>
>> json2.json
>>
>> {"b": 2}
>>
>> I read these JSON files into dataframes using Spark's API, one at a
>> time. So I have
>>
>> Dataset json1DF
>> and
>> Dataset json2DF
>>
>> If I run "select a, b from __THIS__" in a SQLTransformer, I get an
>> exception because json1DF does not have "b" and json2DF does not have "a".
>>
>> How could I handle this situation with missing columns in JSON?
>>
>
