Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22569#discussion_r221147850

    --- Diff: core/src/test/scala/org/apache/spark/util/collection/OpenHashSetSuite.scala ---
    @@ -255,4 +255,16 @@ class OpenHashSetSuite extends SparkFunSuite with Matchers {
         val set = new OpenHashSet[Long](0)
         assert(set.size === 0)
       }
    +
    +  test("support for more than 12M items") {
    +    val cnt = 12000000 // 12M
    +    val set = new OpenHashSet[Int](cnt)
    +    for (i <- 0 until cnt) {
    +      set.add(i)
    +
    +      val pos1 = set.addWithoutResize(i) & OpenHashSet.POSITION_MASK
    +      val pos2 = set.getPos(i)
    +      assert(pos1 == pos2)
    +    }
    --- End diff --

    Oh, the original test used a count check to see how many values were invalid. Those values were invalid because their index was wrong, which in turn was caused by the wrong position mask in OpenHashSet. This rewritten test checks the index returned by OpenHashSet directly.
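To illustrate why a wrong position mask only shows up with a large set, here is a minimal standalone sketch. It does not use Spark's `OpenHashSet`. The object name, `encode`, `decode`, and the faulty mask value are all hypothetical, chosen only for illustration, and the layout (high bit as a "newly added" flag, low 31 bits as the slot index) is an assumption about how such a return value might be packed.

```scala
object PositionMaskSketch {
  // Assumed layout: the high bit flags "key was not present before this call".
  val NonexistenceMask: Int = 1 << 31        // 0x80000000
  // A mask that keeps all of the low 31 bits.
  val GoodPositionMask: Int = (1 << 31) - 1  // 0x7FFFFFFF
  // Hypothetical faulty mask: bit 24 and bits 28-30 are cleared.
  val FaultyPositionMask: Int = 0x0EFFFFFF

  // Pack a slot index together with the "newly added" flag.
  def encode(pos: Int): Int = pos | NonexistenceMask

  // Recover the slot index by masking off the flag.
  def decode(encoded: Int, mask: Int): Int = encoded & mask

  def main(args: Array[String]): Unit = {
    val small = 1000     // fits entirely below bit 24
    val large = 1 << 24  // 16777216, the first index that needs bit 24

    // Small indices survive both masks, so a small test set hides the problem.
    assert(decode(encode(small), GoodPositionMask) == small)
    assert(decode(encode(small), FaultyPositionMask) == small)

    // A large index survives only the full 31-bit mask.
    assert(decode(encode(large), GoodPositionMask) == large)
    assert(decode(encode(large), FaultyPositionMask) != large)

    println(s"good mask:   ${decode(encode(large), GoodPositionMask)}")
    println(s"faulty mask: ${decode(encode(large), FaultyPositionMask)}")
  }
}
```

Under these assumptions, comparing the masked position against an independently looked-up position, as the rewritten test does with `getPos`, fails on the first affected index instead of relying on an aggregate count.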