oh yes, this was by accident, it should have gone to dev
On Wed, May 25, 2016 at 4:20 PM, Reynold Xin wrote:
Created JIRA ticket: https://issues.apache.org/jira/browse/SPARK-15533
@Koert - Please keep API feedback coming. One thing - in the future, can
you send API feedback to the dev@ list instead of user@?
On Wed, May 25, 2016 at 1:05 PM, Cheng Lian wrote:
Agree, since they can be easily replaced by .flatMap (to do explosion)
and .select (to rename output columns)
Cheng
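A minimal sketch of the replacement Cheng describes, assuming Spark 2.x with a local SparkSession. The Book case class, the column names, and the object name are hypothetical and only illustrate the pattern: .flatMap does the explosion, .select renames the output columns.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object FlatMapInsteadOfExplode {
  // Hypothetical record type, used only for illustration.
  case class Book(title: String, words: String)

  def wordsPerTitle(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val books = Seq(Book("a", "x y"), Book("b", "z")).toDS()
    books
      // Explosion: one (title, word) pair per word in each book.
      .flatMap(b => b.words.split(" ").toSeq.map(w => (b.title, w)))
      // Rename the tuple columns _1 and _2 to readable names.
      .select($"_1".as("title"), $"_2".as("word"))
  }
}
```

Calling FlatMapInsteadOfExplode.wordsPerTitle(spark) should give a DataFrame with columns title and word, one row per word.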
On 5/25/16 12:30 PM, Reynold Xin wrote:
Based on this discussion I'm thinking we should deprecate the two
explode functions.
On Wednesday, May 25, 2016, Koert Kuipers wrote:
wenchen,
that definition of explode seems identical to flatMap, so you don't need it
either?
michael,
i didn't know about the column expression version of explode, that makes
sense. i will experiment with that instead.
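For reference, a small sketch of the column expression version of explode mentioned in this thread, assuming Spark 2.x. The sample data, the id column, and the object name are made up for illustration; arrayCol and Item follow the names used in the thread.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.explode

object ColumnExplodeSketch {
  def items(spark: SparkSession): DataFrame = {
    import spark.implicits._
    // Hypothetical input: an id plus an array column to explode.
    val df = Seq((1, Seq("a", "b")), (2, Seq("c"))).toDF("id", "arrayCol")
    // explode produces one output row per array element; as("Item") names the new column.
    df.select($"id", explode($"arrayCol").as("Item"))
  }
}
```

Because explode here is an ordinary column expression, it can be combined with other columns in the same select, with no Row-based function involved.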
On Wed, May 25, 2016 at 3:03 PM, Wenchen Fan wrote:
These APIs predate Datasets / encoders, so that is why they are Row instead
of objects. We should probably rethink that.
Honestly, I usually end up using the column expression version of explode
now that it exists (i.e. explode($"arrayCol").as("Item")). It would be
great to understand more why
we currently have 2 explode definitions in Dataset:
def explode[A <: Product : TypeTag](input: Column*)(f: Row => TraversableOnce[A]): DataFrame
def explode[A, B : TypeTag](inputColumn: String, outputColumn: String)(f: A => TraversableOnce[B]): DataFrame
1) the separation of the functions