Hi,
It sounds similar to what we do in our application.
We don't serialize every row individually; instead, we first group the rows
into the desired representation and then apply protobuf serialization using
map and a lambda.
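To make the pattern concrete: group the rows by key into the shape you want, then serialize once per group via map and a lambda. Here is a minimal stdlib-only sketch of that idea. The names, the toy rows, and the use of struct as a stand-in for protobuf are all illustrative assumptions, not details from this thread (in a real job the grouping would be a Spark groupBy/aggregation and the packing would be a generated protobuf class):

```python
import struct
from itertools import groupby
from operator import itemgetter

# Toy rows: (user_id, value). In Spark these would live in a DataFrame/RDD.
rows = [(1, 10), (2, 5), (1, 32), (2, 7)]

# Step 1: group rows into the desired per-key representation.
grouped = {
    key: [v for _, v in grp]
    for key, grp in groupby(sorted(rows, key=itemgetter(0)), key=itemgetter(0))
}

# Step 2: serialize each group as ONE message via map + a lambda.
# struct.pack stands in for a protobuf SerializeToString() call:
# little-endian uint32 key, uint32 count, then the values.
serialize = lambda key, values: struct.pack(
    f"<II{len(values)}I", key, len(values), *values
)
messages = list(map(lambda kv: serialize(*kv), grouped.items()))

# Each group is now a single binary blob instead of one blob per row.
print(len(messages))     # one message per key
print(len(messages[0]))  # 16 bytes: 2 header ints + 2 values
```

The point of the grouping step is that the expensive serialization happens once per group rather than once per row, and each output blob is a self-describing unit.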
I suggest not serializing the entire DataFrame into a single protobuf
message, since the whole message would have to be assembled in memory in one
place, and protobuf has a hard 2 GB limit on message size anyway.
It sounds like you want to aggregate your rows in some way. I actually just
wrote a blog post about that topic:
https://medium.com/@albamus/spark-aggregating-your-data-the-fast-way-e37b53314fad
On Mon, Aug 19, 2019 at 4:24 PM Rishikesh Gawade wrote:
Hi All,
I have been trying to serialize a dataframe in protobuf format. So far, I
have been able to serialize every row of the dataframe by using the map
function, with the serialization logic inside the lambda function. The
resultant dataframe consists of rows in serialized