You mean I should start two Spark Streaming applications and read the topics
separately?
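
For reference, the per-app empty-column filtering discussed below could be sketched as: for each app partition, keep only the columns that hold at least one non-null value before writing. This is a minimal plain-Python sketch of that idea (the `app_one` rows and column names are hypothetical stand-ins for the Kafka records, not the actual data):

```python
# Sketch: determine which columns actually carry data for one app,
# so all-null columns (e.g. D for app one) can be dropped before
# writing that app's parquet partition.

def nonempty_columns(rows):
    """Return the set of column names with at least one non-null value."""
    cols = set()
    for row in rows:
        for name, value in row.items():
            if value is not None:
                cols.add(name)
    return cols

# Hypothetical records for app one: only A, B, C are populated;
# column D arrives as null because it belongs to app two.
app_one = [
    {"A": 1, "B": 2, "C": 3, "D": None},
    {"A": 4, "B": 5, "C": 6, "D": None},
]

keep = sorted(nonempty_columns(app_one))
print(keep)  # columns A, B, C survive; the all-null D is dropped
```

In Spark terms the same check could be done per app with an aggregation counting non-null values per column, then a `select` of the surviving columns before the parquet write, but that part depends on how the stream is split per app.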


Regards,
Junfeng Chen

On Tue, Apr 3, 2018 at 10:31 PM, naresh Goud <nareshgoud.du...@gmail.com>
wrote:

> I don’t see any option other than starting two individual queries. It’s
> just a thought.
>
> Thank you,
> Naresh
>
> On Mon, Apr 2, 2018 at 10:29 PM Junfeng Chen <darou...@gmail.com> wrote:
>
>> I am trying to read data from Kafka and write it in parquet format
>> via Spark Streaming.
>> The problem is that the data from Kafka have a variable structure. For
>> example, app one has columns A,B,C and app two has columns B,C,D, so the
>> dataframe I read from Kafka has all columns A,B,C,D. When I write the
>> dataframe to parquet files partitioned by app name,
>> the parquet file for app one also contains column D, where column D
>> is empty and actually contains no data. So how can I filter out the empty
>> columns when writing the dataframe to parquet?
>>
>> Thanks!
>>
>>
>> Regards,
>> Junfeng Chen
>>
> --
> Thanks,
> Naresh
> www.linkedin.com/in/naresh-dulam
> http://hadoopandspark.blogspot.com/
>
>