[ https://issues.apache.org/jira/browse/SPARK-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Wendell resolved SPARK-7873.
------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.4.0

> Serializer re-use + Kryo autoReset disabled leads to ArrayIndexOutOfBounds exception in sort-shuffle bypassMergeSort path
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-7873
>                 URL: https://issues.apache.org/jira/browse/SPARK-7873
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.4.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>            Priority: Blocker
>             Fix For: 1.4.0
>
>
> This is a somewhat obscure bug, but I think that it will seriously impact KryoSerializer users whose custom registrators disable auto-reset. When auto-reset is disabled, some of our shuffle paths break, because they end up creating multiple OutputStreams from the same shared SerializerInstance (which is unsafe). To illustrate this, the following test fails in 1.4:
> {code}
> class KryoSerializerAutoResetDisabledSuite extends FunSuite with SharedSparkContext {
>   conf.set("spark.serializer", classOf[KryoSerializer].getName)
>   conf.set("spark.kryo.registrator", classOf[RegistratorWithoutAutoReset].getName)
>   test("sort-shuffle with bypassMergeSort") {
>     val myObject = ("Hello", "World")
>     assert(sc.parallelize(Seq.fill(100)(myObject)).repartition(2).collect().toSet === Set(myObject))
>   }
> }
> {code}
> This was introduced by a patch (SPARK-3386) which enables serializer re-use in some of the shuffle paths, since constructing new serializer instances is actually pretty costly for KryoSerializer. We had already fixed another corner-case bug related to this (SPARK-7766), but missed this one.
> From an engineering risk management perspective, we probably should have just reverted the original serializer re-use patch and added a big cross-product-of-configurations-and-shuffle-managers test suite before attempting to fix the defects.
> I think that I have a pretty simple fix for this, but we still might want to consider a revert for 1.4 just to be safe.
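
The test in the issue description refers to a class named RegistratorWithoutAutoReset, but the message does not show its definition. A minimal sketch of what such a registrator might look like is given below, assuming Spark and Kryo are on the classpath. The class body is an assumption made for illustration; the issue only states that the registrator disables auto-reset, and the code in the actual patch may differ.

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Assumed definition, for illustration only: the issue names this class
// but does not include its body.
class RegistratorWithoutAutoReset extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // With auto-reset off, Kryo keeps its internal state between
    // top-level write calls instead of clearing it after each one.
    kryo.setAutoReset(false)
  }
}
```

A registrator like this is selected through the spark.kryo.registrator setting, as the conf.set call in the quoted test does.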