Sorry if I trivialized the example. It is the same kind of file and sometimes it could have "a", sometimes "b", sometimes both. I just don't know. That is what I meant by missing columns.
It would be good if I read any of the JSON and if I do spark sql and it gave me for json1.json a | b 1 | null for json2.json a | b null | 2 On Tue, Feb 14, 2017 at 8:13 PM, Sam Elamin <hussam.ela...@gmail.com> wrote: > I may be missing something super obvious here but can't you combine them > into a single dataframe. Left join perhaps? > > Try writing it in sql " select a from json1 and b from josn2"then run > explain to give you a hint to how to do it in code > > Regards > Sam > On Tue, 14 Feb 2017 at 14:30, Aseem Bansal <asmbans...@gmail.com> wrote: > >> Say I have two files containing single rows >> >> json1.json >> >> {"a": 1} >> >> json2.json >> >> {"b": 2} >> >> I read in this json file using spark's API into a dataframe one at a >> time. So I have >> >> Dataset json1DF >> and >> Dataset json2DF >> >> If I run "select a, b from __THIS__" in a SQLTransformer then I will get >> an exception as for json1DF does not have "b" and json2DF does not have "a" >> >> How could I handle this situation with missing columns in JSON? >> >