Dinesh Kumar created SPARK-32486:
------------------------------------
Summary: Issue with deserialization and persist API in latest
Spark Java versions
Key: SPARK-32486
URL: https://issues.apache.org/jira/browse/SPARK-32486
Project: Spark
Issue Type: Bug
Components: Java API
Affects Versions: 3.0.0, 2.4.6, 2.4.5, 2.4.4
Environment: Occurs on all operating systems with Java 8
Reporter: Dinesh Kumar
Fix For: 2.3.3
Hey Team, one of our classes instantiates objects in its field declarations
(class-level initializers). When we load data into a Dataset of this class
type, null values in the source are not preserved for those fields. The
class-level initializer takes precedence instead, so the field shows up as a
new object and not as null.
E.g.:
Test.class has the following class-level attributes:
private Test1 testNumber = new Test1();
private Test2 testNumber2;
String inputLocation = "src/test/resources/pipeline/test.parquet";
Dataset<Row> ds = this.session.read().parquet(inputLocation);
ds.printSchema();
// The cast to ForeachFunction (org.apache.spark.api.java.function) avoids an
// ambiguous-overload compile error on Scala 2.12 builds.
ds.foreach((ForeachFunction<Row>) input -> {
    // When we verified, this shows testNumber and testNumber2 as null.
    System.out.println(input);
});
Dataset<Test> inputDataSet = ds.as(Encoders.bean(Test.class));
inputDataSet.foreach((ForeachFunction<Test>) input -> {
    // When we verified, this shows testNumber as new Test1() and
    // testNumber2 as null.
    System.out.println(input);
});
The same issue occurs with the dataset.persist() call as well. It happens on
2.4.4 and all later versions. Can you please fix it?
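
To make the symptom easier to reason about, here is a minimal sketch that reproduces the same observable behaviour without Spark. It rests on an assumption that this report does not confirm: that the bean is created through its no-argument constructor and that setters are then called only for non-null column values. The class NullSkippingDemo, the two populate methods, and the Map used as a stand-in for a row are all hypothetical names introduced for illustration; they are not Spark APIs.

```java
import java.util.HashMap;
import java.util.Map;

public class NullSkippingDemo {

    // Stand-ins for the Test1 and Test2 types named in the report.
    static class Test1 {}
    static class Test2 {}

    // Mirrors the bean in the report: testNumber has a class-level initializer.
    static class Test {
        private Test1 testNumber = new Test1();
        private Test2 testNumber2;

        public Test1 getTestNumber() { return testNumber; }
        public void setTestNumber(Test1 value) { this.testNumber = value; }
        public Test2 getTestNumber2() { return testNumber2; }
        public void setTestNumber2(Test2 value) { this.testNumber2 = value; }
    }

    // ASSUMED behaviour: setters are invoked only for non-null values,
    // so a null in the row never overwrites the field initializer.
    static Test populateSkippingNulls(Map<String, Object> row) {
        Test bean = new Test();
        Object first = row.get("testNumber");
        if (first != null) {
            bean.setTestNumber((Test1) first);
        }
        Object second = row.get("testNumber2");
        if (second != null) {
            bean.setTestNumber2((Test2) second);
        }
        return bean;
    }

    // Contrast case: setters are always invoked, so nulls are preserved.
    static Test populateAlways(Map<String, Object> row) {
        Test bean = new Test();
        bean.setTestNumber((Test1) row.get("testNumber"));
        bean.setTestNumber2((Test2) row.get("testNumber2"));
        return bean;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("testNumber", null);
        row.put("testNumber2", null);

        Test skipped = populateSkippingNulls(row);
        Test always = populateAlways(row);

        // The initializer survives when null setters are skipped.
        System.out.println("skip-null testNumber is null: " + (skipped.getTestNumber() == null));
        // The null is preserved when the setter is always called.
        System.out.println("always-set testNumber is null: " + (always.getTestNumber() == null));
    }
}
```

If that assumption holds, removing the class-level initializer from testNumber (or assigning the default somewhere other than the field declaration) would let the null from the source data come through, at the cost of changing the bean's default state elsewhere.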