Abdulla Al-Qawasmeh created SPARK-17145:
-------------------------------------------

             Summary: Object with many fields causes Seq Serialization Bug 
                 Key: SPARK-17145
                 URL: https://issues.apache.org/jira/browse/SPARK-17145
             Project: Spark
          Issue Type: Bug
    Affects Versions: 2.0.0
         Environment: OS: OSX El Capitan 10.11.6

            Reporter: Abdulla Al-Qawasmeh


The unit test at 
https://gist.github.com/abdulla16/433faf7df59fce11a5fff284bac0d945 reproduces 
the problem. 

Spark appears to have trouble serializing a Scala Seq when the Seq is a field 
of an object with many fields (I'm not 100% sure it's a serialization problem). 
The deserialized Seq has as many items as the original Seq, but every item is 
a copy of the last item in the original Seq.
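
The symptom described above can be written as a small predicate, which may be 
useful as the assertion in a regression test. This is only a sketch in plain 
Scala with no Spark dependency: the names SeqSymptom and allCopiesOfLast are 
made up for illustration and do not come from the gist, and the sample tuples 
are arbitrary.

```scala
object SeqSymptom {

  /** True when `roundTripped` shows the reported symptom: it has the same
    * length as `original`, yet every element equals the LAST element of
    * `original`. The check is only meaningful when `original` holds at
    * least two distinct elements, so other inputs return false.
    */
  def allCopiesOfLast[A](original: Seq[A], roundTripped: Seq[A]): Boolean =
    original.distinct.size > 1 &&
      roundTripped.size == original.size &&
      roundTripped.forall(_ == original.last)

  def main(args: Array[String]): Unit = {
    // Arbitrary Tuple5 values standing in for the Seq elements.
    val original = Seq(
      ("a", 1, 1.0, true, 'x'),
      ("b", 2, 2.0, false, 'y'),
      ("c", 3, 3.0, true, 'z'))

    // Simulates the corrupted result: right length, last item repeated.
    val corrupted = Seq.fill(original.size)(original.last)

    println(allCopiesOfLast(original, corrupted)) // symptom present
    println(allCopiesOfLast(original, original))  // intact round trip
  }
}
```

In a Spark test, `original` would be the Seq field of the object before it is 
encoded and `roundTripped` the same field after the object is read back; how 
that round trip is performed is left to the unit test in the gist.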

The object that I used in my unit test (as an example) is a Tuple5. However, 
I've seen this behavior in other types of objects. 

Reducing MyClass5 to only two fields (field34 and field35) causes the unit test 
to pass. 

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
