Row(20, 11), Row(10, 2, 11))

The first tuple is used to group the relevant records for aggregation. I
have used the following to create the dataset:

val s = StructType(Seq(
  StructField("x", IntegerType, true),
  StructField("y", IntegerType, true)
))
val s1 = StructType(Seq(
  StructField("u", IntegerType, true),
  StructField("v", IntegerType, true),
  StructField("z", IntegerType, true)
))

val ds = sparkSession.sqlContext.createDataset(
  sparkSession.sparkContext.parallelize(values)
)(Encoders.tuple(RowEncoder(s), RowEncoder(s1)))

Is this the correct way of representing this?

How do I create the dataset and row encoder for such a use case so that I
can do a groupByKey on it?

Regards
Sandeep
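A minimal sketch of one way this can be wired up, assuming Spark 2.x. The `values` sample below is invented for illustration (the original definition of `values` was cut off above); the point is that the tuple encoder built from two RowEncoders is passed explicitly to createDataset, and that groupByKey on the first Row needs its own key encoder:

```scala
import org.apache.spark.sql.{Encoders, Row, SparkSession}
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").appName("sketch").getOrCreate()

// Schemas for the two Rows in each tuple, as in the question.
val s = StructType(Seq(
  StructField("x", IntegerType, true),
  StructField("y", IntegerType, true)))
val s1 = StructType(Seq(
  StructField("u", IntegerType, true),
  StructField("v", IntegerType, true),
  StructField("z", IntegerType, true)))

// Hypothetical sample data; each element is (key Row, value Row).
val values = Seq(
  (Row(20, 11), Row(10, 2, 11)),
  (Row(20, 11), Row(30, 4, 12)),
  (Row(40, 5), Row(7, 8, 9)))

// Encoder for the (Row, Row) tuple, combining the two RowEncoders.
val tupleEnc = Encoders.tuple(RowEncoder(s), RowEncoder(s1))
val ds = spark.createDataset(spark.sparkContext.parallelize(values))(tupleEnc)

// Group on the first Row of each tuple; the key needs its own encoder.
val grouped = ds.groupByKey(_._1)(RowEncoder(s))
grouped.count().show()  // one count per distinct key Row
```

Note that groupByKey takes the key encoder as a second (implicit) argument list, so when it is not in implicit scope it can be supplied explicitly as above.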