Sajeev Ramakrishnan created SPARK-22663:
-------------------------------------------
Summary: Spark DataSet to case class mapping mismatches
Key: SPARK-22663
URL: https://issues.apache.org/jira/browse/SPARK-22663
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 2.2.0
Reporter: Sajeev Ramakrishnan
Priority: Minor
Dear Team,
Currently, when we create a Dataset from a data source, we append
as[<case-class>] to map the rows to a case class. But if the case class has
an attribute with no matching column, Spark throws an error.
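E.g.
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.functions.{col, split}
import spark.implicits._  // provides the encoder needed by as[MyClass]

case class MyClass(
  var line: String = "",
  var prevLine: String = ""
)

val raw = spark.read.textFile(<file>)
// Only "line" is selected, so the result has no column for prevLine.
val a: Dataset[MyClass] = raw
  .withColumn("line", split(col("value"), "\\t"))
  .select(col("line").getItem(0).as("line"))
  .as[MyClass]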
This code fails with an error saying that there is no match for prevLine.
Fixing this would make it easier to build Spark programs with Datasets where
many joins are involved and each result adds more columns. It is difficult to
maintain a different case class for every join.
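Until something like this is supported, one possible workaround is to pad the
DataFrame with the missing columns before calling as[...]. The sketch below is
only an illustration under stated assumptions: the object PadMissingColumns and
the helper asPadded are hypothetical names, not part of Spark, and the input is
an in-memory row instead of a file. Missing fields are filled with typed nulls
(not with the case class defaults), so this suits nullable fields such as
String; a missing primitive field like Int would still fail when rows are
decoded.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoder, SparkSession}
import org.apache.spark.sql.functions.{col, lit, split}

case class MyClass(line: String = "", prevLine: String = "")

object PadMissingColumns {
  // Hypothetical helper (not a Spark API): for every field the target type
  // expects but the DataFrame lacks, add a null column of the right type.
  def asPadded[T: Encoder](df: DataFrame): Dataset[T] = {
    val schema = implicitly[Encoder[T]].schema
    val padded = schema.fields.foldLeft(df) { (acc, field) =>
      if (acc.columns.contains(field.name)) acc
      else acc.withColumn(field.name, lit(null).cast(field.dataType))
    }
    // Keep only the columns the target type declares, in declaration order.
    padded.select(schema.fieldNames.map(name => col(name)): _*).as[T]
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("pad-missing-columns")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for spark.read.textFile(...): one tab-separated line.
    val raw = Seq("first\tsecond").toDF("value")
    val lines = raw
      .withColumn("line", split(col("value"), "\\t"))
      .select(col("line").getItem(0).as("line"))

    // "prevLine" is absent from `lines`; asPadded adds it as a null column.
    val ds: Dataset[MyClass] = asPadded[MyClass](lines)
    ds.show()

    spark.stop()
  }
}
```

The helper matches column names case-sensitively, which is stricter than
Spark's default column resolution, so it may add a duplicate column if names
differ only in case.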
Thanks & Regards,
Sajeev Ramakrishnan