Hi,

Thanks for explaining!

Regards,
Junfeng Chen

On Wed, Apr 4, 2018 at 7:43 PM, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:

> Hi,
>
> I do not think that in a columnar format it makes much of a difference.
> The amount of data that you will be parsing will not be much anyway.
>
> Regards,
> Gourav Sengupta
>
> On Wed, Apr 4, 2018 at 11:02 AM, Junfeng Chen <darou...@gmail.com> wrote:
>
>> Our users ask for it.
>>
>> Regards,
>> Junfeng Chen
>>
>> On Wed, Apr 4, 2018 at 5:45 PM, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
>>
>>> Hi Junfeng,
>>>
>>> Can I ask why it is important to remove the empty column?
>>>
>>> Regards,
>>> Gourav Sengupta
>>>
>>> On Tue, Apr 3, 2018 at 4:28 AM, Junfeng Chen <darou...@gmail.com> wrote:
>>>
>>>> I am trying to read data from Kafka and write it in Parquet format
>>>> via Spark Streaming. The problem is that the data from Kafka have a
>>>> variable structure. For example, app one has columns A, B, C, while
>>>> app two has columns B, C, D, so the dataframe I read from Kafka has
>>>> all columns A, B, C, D. When I write the dataframe to a Parquet file
>>>> partitioned by app name, the Parquet file of app one also contains
>>>> column D, which is empty and actually holds no data. How can I filter
>>>> out the empty columns when writing the dataframe to Parquet?
>>>>
>>>> Thanks!
>>>>
>>>> Regards,
>>>> Junfeng Chen