There is already an explode function on DataFrame btw

I think something like this would work. You might need to play with the

df.explode("arrayBufferColumn") { x => x }
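
For anyone who wants to try the flatMap approach quoted below without a cluster, here is a minimal sketch on plain Scala collections. It assumes the rows are (String, ArrayBuffer[String]) pairs, which is a guess based on the sample output in the original question. ExplodeSketch and explodeRows are hypothetical names used only for illustration. The same function literal should work unchanged when passed to flatMap on an RDD of such tuples.

```scala
import scala.collection.mutable.ArrayBuffer

// Plain Scala collections stand in for the RDD here (assumption for illustration).
object ExplodeSketch {
  // Hypothetical helper: pairs each date with every element of its array.
  def explodeRows(rows: Seq[(String, ArrayBuffer[String])]): Seq[(String, String)] =
    rows.flatMap { case (date, array) => for (x <- array) yield (date, x) }

  def main(args: Array[String]): Unit = {
    // Sample row taken from the original question.
    val rows = Seq(("2015-04", ArrayBuffer("A", "B", "C", "D")))
    // Print one "date, element" line per array entry.
    explodeRows(rows).foreach { case (date, x) => println(s"$date, $x") }
  }
}
```

Running main prints one "date, element" line per array entry, which is the layout asked for in the original question.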

On Fri, Apr 3, 2015 at 6:43 AM, Denny Lee <> wrote:

> Thanks Dean - fun hack :)
> On Fri, Apr 3, 2015 at 6:11 AM Dean Wampler <> wrote:
>> A hack workaround is to use flatMap:
>> rdd.flatMap{ case (date, array) => for (x <- array) yield (date, x) }
>> For those of you who don't know Scala, the for comprehension iterates
>> through the ArrayBuffer, named "array", and yields new tuples with the date
>> and each element. The case expression to the left of the => pattern matches
>> on the input tuples.
>> Dean Wampler, Ph.D.
>> Author: Programming Scala, 2nd Edition (O'Reilly)
>> Typesafe
>> @deanwampler
>> On Thu, Apr 2, 2015 at 10:45 PM, Denny Lee <> wrote:
>>> Thanks Michael - that was it!  I was drawing a blank on this one for
>>> some reason - much appreciated!
>>> On Thu, Apr 2, 2015 at 8:27 PM Michael Armbrust <>
>>> wrote:
>>>> A lateral view explode using HiveQL.  I'm hoping to add explode
>>>> shorthand directly to the df API in 1.4.
>>>> On Thu, Apr 2, 2015 at 7:10 PM, Denny Lee <>
>>>> wrote:
>>>>> Quick question - the output of a dataframe is in the format of:
>>>>> [2015-04, ArrayBuffer(A, B, C, D)]
>>>>> and I'd like to return it as:
>>>>> 2015-04, A
>>>>> 2015-04, B
>>>>> 2015-04, C
>>>>> 2015-04, D
>>>>> What's the best way to do this?
>>>>> Thanks in advance!
