Hi,
I have created a Parquet file from a case class using saveAsParquetFile.
Then I try to reload it using parquetFile, but it fails.
Sample code is attached.
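In short, the attached code does roughly the following (a sketch; Person and
the sample data here are placeholders):

  // Spark 1.x SchemaRDD API in spark-shell
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext.createSchemaRDD  // implicit RDD[case class] -> SchemaRDD

  case class Person(name: String, age: Int)

  val people = sc.parallelize(Seq(Person("alice", 30), Person("bob", 25)))
  people.saveAsParquetFile("people.parquet")   // the write succeeds

  val loaded = sqlContext.parquetFile("people.parquet")  // the reload fails here
  loaded.collect().foreach(println)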
Any help would be appreciated.
Regards,
Rahul
rahul@...
Tobias,
Understood, and thanks for the quick resolution of the problem.
Thanks
~Rahul
Is it a limitation that Spark does not support more than one case class at a
time?
Regards,
Rahul
Hi Tobias,
Thanks for your response.
I have created object files [person_obj, office_obj] from CSV files
[person_csv, office_csv] using case classes [person, office] with the
saveAsObjectFile API.
Then I restarted spark-shell and loaded the object files using the objectFile
API. *Once any one of the object files has been loaded, loading the other one
fails.*
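In code, the creation step looks roughly like this (a sketch; the CSV column
layouts and the case-class fields are my assumptions):

  case class Person(name: String, age: Int)
  case class Office(name: String, city: String)

  // build the object files from the CSVs
  sc.textFile("person_csv")
    .map(_.split(","))
    .map(f => Person(f(0), f(1).trim.toInt))
    .saveAsObjectFile("person_obj")

  sc.textFile("office_csv")
    .map(_.split(","))
    .map(f => Office(f(0), f(1).trim))
    .saveAsObjectFile("office_obj")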
Tobias,
Thanks for the quick reply.
Definitely, after a restart the case classes need to be defined again.
I have done so; that is why Spark is able to load the first object file [e.g.
person_obj], and Spark has maintained its serialVersionUID.
The next time, when I am trying to load another object file [e.g. office_obj],
it fails.
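One idea I have not tried yet: pinning the serialVersionUID of each case class
explicitly, so it cannot differ between definitions (untested on my side):

  // pin the serialVersionUID so it stays stable across shell sessions
  @SerialVersionUID(1L) case class Person(name: String, age: Int)
  @SerialVersionUID(2L) case class Office(name: String, city: String)

Would that be expected to help here?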
Tobias,
Find the csv and scala files attached; the steps are below:
1. Copy the csv files into the current directory.
2. Open spark-shell from this directory.
3. Run the one_scala file, which creates object files from the csv files in
the current directory.
4. Restart spark-shell.
5. Run the two_scala file; while running it, the second load fails (sketched
below).
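For reference, two_scala is essentially this (a sketch; the case-class fields
are placeholders):

  // two_scala (sketch): run after restarting spark-shell
  case class Person(name: String, age: Int)
  case class Office(name: String, city: String)

  val persons = sc.objectFile[Person]("person_obj")
  println(persons.count())   // the first load works

  val offices = sc.objectFile[Office]("office_obj")
  println(offices.count())   // the second load is where it fails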
Hi,
I am a newbie in Spark and performed the following steps during POC execution:
1. Map a csv file to an object file, after some transformations, once.
2. Load (deserialize) the object file back into an RDD for operations, as per
need.
In the case of 2 csv/object files, the first object file is deserialized into
an RDD successfully, but during the load of the second one it fails.