A hacky workaround is to use flatMap:

rdd.flatMap{ case (date, array) => for (x <- array) yield (date, x) }

For those of you who don't know Scala: the for comprehension iterates
through the ArrayBuffer, named "array", and yields a new tuple of the date
and each element. The case expression to the left of the => pattern matches
on the input tuples, so this assumes the RDD's elements are (date, array)
pairs.
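
To try the same shape without a cluster, here is a minimal sketch on a plain
Scala Seq rather than an RDD. The object name ExplodeSketch, the helper
explode, and the sample data are made up for illustration; the flatMap body
is the same as in the one-liner above.

```scala
import scala.collection.mutable.ArrayBuffer

object ExplodeSketch {
  // Hypothetical helper: same flatMap body as above, applied to a local Seq
  // of (date, array) tuples instead of an RDD.
  def explode(rows: Seq[(String, ArrayBuffer[String])]): Seq[(String, String)] =
    rows.flatMap { case (date, array) => for (x <- array) yield (date, x) }

  def main(args: Array[String]): Unit = {
    // Sample row shaped like the one in the original question.
    val rows = Seq(("2015-04", ArrayBuffer("A", "B", "C", "D")))
    // Prints one "date, element" line per array element.
    explode(rows).foreach { case (date, x) => println(s"$date, $x") }
  }
}
```

Each input tuple produces one output tuple per array element, and flatMap
concatenates those per-row results into a single flat collection.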

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
<http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Thu, Apr 2, 2015 at 10:45 PM, Denny Lee <denny.g....@gmail.com> wrote:

> Thanks Michael - that was it!  I was drawing a blank on this one for some
> reason - much appreciated!
>
>
> On Thu, Apr 2, 2015 at 8:27 PM Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> A lateral view explode using HiveQL.  I'm hoping to add explode
>> shorthand directly to the df API in 1.4.
>>
>> On Thu, Apr 2, 2015 at 7:10 PM, Denny Lee <denny.g....@gmail.com> wrote:
>>
>>> Quick question - the output of a dataframe is in the format of:
>>>
>>> [2015-04, ArrayBuffer(A, B, C, D)]
>>>
>>> and I'd like to return it as:
>>>
>>> 2015-04, A
>>> 2015-04, B
>>> 2015-04, C
>>> 2015-04, D
>>>
>>> What's the best way to do this?
>>>
>>> Thanks in advance!
>>>
>>>
>>>
>>
