Hi,

I have a use case where I need to merge three datasets and build a single
one that contains the data wherever it is available in any of them.

Each dataset is a complex, nested object:

Customer
- name - String
- accounts - List<Account>

Account
- type - String
- addresses - List<Address>

Address
- name - String

And so on, with further nested levels.
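
In Scala, the target model would look roughly like the sketch below. It only
covers the fields listed above (the real classes have more), and the exact
field names are my assumption based on that outline.

```scala
// Sketch of the target model; only the fields listed above are shown.
// `type` is a reserved word in Scala, so it is written in backticks.
case class Address(name: String)
case class Account(`type`: String, addresses: List[Address])
case class Customer(name: String, accounts: List[Account])
```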

These files are in Parquet format.


Each of the three input datasets has some of the details, and these need to
be merged into one dataset that has all the information (I know which files
need to be merged).


I want to know how I should proceed with this.

- My approach is to build a case class for the actual output and parse the
three datasets into it, but this fails because the inputs do not have all
the fields.

So basically, what should the approach be for this kind of problem?

Second, how can I convert a Parquet DataFrame to a Dataset when the Parquet
struct does not have all the fields but the case class does? (I am getting
a "no struct type found" error.)

Thanks
Manjay Kumar
8320 120 839
