Re: Turning rows into columns

2017-02-11 Thread Paul Tremblay
Yes, that's what I need. Thanks.

P.

On 02/05/2017 12:17 PM, Koert Kuipers wrote:
since there is no key to group by and assemble records i would suggest to write this in RDD land and then convert to data frame. you can use sc.wholeTextFiles to process text files and create a state machine O

Re: Turning rows into columns

2017-02-05 Thread Koert Kuipers
Since there is no key to group by and assemble records, I would suggest writing this in RDD land and then converting to a data frame. You can use sc.wholeTextFiles to process the text files and create a state machine.

On Feb 4, 2017 16:25, "Paul Tremblay" wrote:
I am using pyspark 2.1 and am wondering how
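
The state-machine idea in this reply could be sketched roughly as below. This is only an illustration, not code from the thread: the function name parse_records, the "version" key, and the second sample record are hypothetical, and it assumes every record starts with a WARC/1.0 line followed by "Key: value" header lines. Real WARC payloads can contain colons too, so a fuller parser would need to stop at the blank line that ends the headers.

```python
def parse_records(text, start_marker="WARC/1.0"):
    """Group flat 'Key: value' lines into one dict per record.

    Hypothetical helper: assumes each record begins with start_marker.
    """
    records = []
    current = None  # state: None means we are not inside a record yet
    for raw in text.splitlines():
        line = raw.strip()
        if line == start_marker:
            # A new record begins, so flush the one we were building.
            if current is not None:
                records.append(current)
            current = {"version": line}
        elif current is not None:
            # Split on the first colon only, so timestamps stay intact.
            key, sep, value = line.partition(":")
            if sep:
                current[key.strip()] = value.strip()
    if current is not None:
        records.append(current)
    return records


# First record uses header lines from the question; the second is made up.
sample = "\n".join([
    "WARC/1.0",
    "WARC-Type: warcinfo",
    "WARC-Date: 2016-12-08T13:00:23Z",
    "Content-Length: 344",
    "",
    "WARC/1.0",
    "WARC-Type: request",
    "Content-Length: 220",
])
records = parse_records(sample)
```

In Spark, something like sc.wholeTextFiles(path).flatMap(lambda kv: parse_records(kv[1])) would apply this per file, since wholeTextFiles yields (path, content) pairs. Each file is read as a single string, so this suits files that fit in executor memory. The resulting dicts would still need to be mapped to a fixed set of columns before building a data frame.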

Turning rows into columns

2017-02-04 Thread Paul Tremblay
I am using pyspark 2.1 and am wondering how to convert a flat file, with one record per row, into a columnar format. Here is an example of the data:

u'WARC/1.0',
u'WARC-Type: warcinfo',
u'WARC-Date: 2016-12-08T13:00:23Z',
u'WARC-Record-ID: ',
u'Content-Length: 344',
u'Content-Type: applicati