GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/9030

    [SPARK-10914] UnsafeRow serialization breaks when two machines have different Oops size.

    UnsafeRow contains 3 pieces of information when pointing to some data in memory (an object, a base offset, and length). When the row is serialized with Java/Kryo serialization, the object layout in memory can change if two machines have different pointer width (Oops in JVM).

    To reproduce, launch Spark using

        MASTER=local-cluster[2,1,1024] bin/spark-shell --conf "spark.executor.extraJavaOptions=-XX:-UseCompressedOops"

    And then run the following

        scala> sql("select 1 xx").collect()

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-10914

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9030.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9030

----
commit 465fc8e18147b9e8cf34e0f5bcbc338d03ad4f95
Author: Reynold Xin <r...@databricks.com>
Date:   2015-10-08T18:34:14Z

    [SPARK-10914] UnsafeRow serialization breaks when two machines have different Oops size.

    The problem is that UnsafeRow contains 3 pieces of information when pointing to some data in memory (an object, a base offset, and length). When the row is serialized with Java/Kryo serialization, the object layout in memory can change if two machines have different pointer width (Oops in JVM).

    To reproduce, launch Spark using

        MASTER=local-cluster[2,1,1024] bin/spark-shell --conf "spark.executor.extraJavaOptions=-XX:-UseCompressedOops"

    And then run the following

        scala> sql("select 1 xx").collect()

    (cherry picked from commit 157b2a818d3993b1321cc41fb7b30407bd13490b)
    Signed-off-by: Reynold Xin <r...@databricks.com>

----
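
For readers who want to see why a shipped base offset goes stale, here is a rough, self-contained simulation of the failure described in the pull request. It does not use Spark or UnsafeRow. The class RowOffsetSketch, its methods, and the header sizes 16 and 24 are all hypothetical, chosen only to resemble a JVM running with and without compressed Oops. The "portable" variant illustrates one possible remedy (ship only the payload bytes and recompute the offset on the receiving side); it is not taken from the patch itself.

```java
import java.util.Arrays;

/** Simulation only: an "object" is modelled as header bytes followed by payload bytes. */
final class RowOffsetSketch {
    // Hypothetical header sizes standing in for two JVMs with different pointer widths.
    static final int HEADER_COMPRESSED = 16;
    static final int HEADER_UNCOMPRESSED = 24;

    /** Lays out a simulated object: headerSize filler bytes (0x7F), then the payload. */
    static byte[] layout(int headerSize, byte[] payload) {
        byte[] object = new byte[headerSize + payload.length];
        Arrays.fill(object, 0, headerSize, (byte) 0x7F);
        System.arraycopy(payload, 0, object, headerSize, payload.length);
        return object;
    }

    /** Reads length bytes starting at an absolute offset into the simulated object. */
    static byte[] read(byte[] object, int baseOffset, int length) {
        return Arrays.copyOfRange(object, baseOffset, baseOffset + length);
    }

    /** Naive transfer: the sender's absolute base offset is reused on the receiver. */
    static byte[] naiveTransfer(byte[] payload, int senderHeader, int receiverHeader) {
        int shippedOffset = senderHeader; // base offset as the sender saw it
        byte[] received = layout(receiverHeader, payload);
        return read(received, shippedOffset, payload.length);
    }

    /** Portable transfer: only the payload travels; the receiver uses its own offset. */
    static byte[] portableTransfer(byte[] payload, int receiverHeader) {
        byte[] received = layout(receiverHeader, payload);
        return read(received, receiverHeader, payload.length);
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3, 4, 5, 6, 7, 8};

        // Same layout on both sides: the shipped offset is still valid.
        System.out.println(Arrays.equals(payload,
                naiveTransfer(payload, HEADER_COMPRESSED, HEADER_COMPRESSED)));   // true

        // Different layouts: the shipped offset now points into the header.
        System.out.println(Arrays.equals(payload,
                naiveTransfer(payload, HEADER_COMPRESSED, HEADER_UNCOMPRESSED))); // false

        // Recomputing the offset locally recovers the payload.
        System.out.println(Arrays.equals(payload,
                portableTransfer(payload, HEADER_UNCOMPRESSED)));                 // true
    }
}
```

In this sketch the mixed-layout case reads eight filler bytes instead of the row data, which is the kind of corruption the reproduction steps above would be expected to trigger when the driver and executors disagree on -XX:-UseCompressedOops.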