Eyal Farago created ARROW-7837:
----------------------------------

             Summary: bug in BaseVariableWidthVector.copyFromSafe results with an index out of bounds exception
                 Key: ARROW-7837
                 URL: https://issues.apache.org/jira/browse/ARROW-7837
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Java
    Affects Versions: 0.15.0
            Reporter: Eyal Farago

There's a subtle bug in the copyFromSafe method of BaseVariableWidthVector that results in an IndexOutOfBoundsException.

The issue lies in the interaction between copyFromSafe and handleSafe. copyFromSafe calls handleSafe to ensure the underlying buffers have enough capacity before appending a value to the vector. However, handleSafe wrongly assumes that all 'holes' have already been filled when it looks up the next write offset. As a result, it reads a stale offset (I believe it's 0 for freshly allocated buffers, but that may not be guaranteed when a buffer is reused) and fails to identify the need to resize the values buffer.

The following (Scala) test demonstrates the issue by artificially shrinking the values buffer. It was written after we hit this in production:

{code:java}
test("try to reproduce Arrow issue") {
  val charVector = new VarCharVector("stam", Allocator.get)
  val srcCharVector = new VarCharVector("src", Allocator.get)
  // a single 20-byte source value at index 0
  srcCharVector.setSafe(0, Array.tabulate(20)(_.toByte))
  srcCharVector.setValueCount(2)
  // four consecutive copies: 80 bytes now sit in the values buffer
  for (i <- 0 until 4) {
    charVector.copyFromSafe(0, i, srcCharVector)
    charVector.setValueCount(i + 1)
  }
  // shrink the values buffer so that one more 20-byte value no longer fits
  val valBuff = charVector.getBuffers(false)(2)
  valBuff.capacity(90)
  // writing at index 14 leaves slots 4..13 as 'holes'
  charVector.copyFromSafe(0, 14, srcCharVector)

  srcCharVector.close()
  charVector.close()
}
{code}

This test fails with the following exception:

{code:java}
index: 80, length: 20 (expected: range(0, 90))
java.lang.IndexOutOfBoundsException: index: 80, length: 20 (expected: range(0, 90))
	at io.netty.buffer.ArrowBuf.getBytes(ArrowBuf.java:929)
	at org.apache.arrow.vector.BaseVariableWidthVector.copyFromSafe(BaseVariableWidthVector.java:1345)
	at com.datorama.pluto.arrow.ArroStreamSerializationTest.$anonfun$new$33(ArroStreamSerializationTest.scala:454)
	at com.datorama.pluto.arrow.ArroStreamSerializationTest$$Lambda$129.00000000F78CFE20.apply$mcV$sp(Unknown Source)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
	at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
	at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
	at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
	at org.scalatest.Transformer.apply(Transformer.scala:22)
	at org.scalatest.Transformer.apply(Transformer.scala:20)
	at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
	at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
	at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
	at org.scalatest.FunSuite.withFixture(FunSuite.scala:1560)
	at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
	at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
	at org.scalatest.FunSuiteLike$$Lambda$367.00000000001B9220.apply(Unknown Source)
	at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
	at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
	at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
	at com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterEachTestData$$super$runTest(ArroStreamSerializationTest.scala:32)
	at org.scalatest.BeforeAndAfterEachTestData.runTest(BeforeAndAfterEachTestData.scala:194)
	at org.scalatest.BeforeAndAfterEachTestData.runTest$(BeforeAndAfterEachTestData.scala:187)
	at com.datorama.pluto.arrow.ArroStreamSerializationTest.runTest(ArroStreamSerializationTest.scala:32)
	at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
	at org.scalatest.FunSuiteLike$$Lambda$358.000000001AAC0020.apply(Unknown Source)
	at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
	at org.scalatest.SuperEngine$$Lambda$359.000000001AAC0820.apply(Unknown Source)
	at scala.collection.immutable.List.foreach(List.scala:388)
	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
	at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
	at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
	at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
	at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
	at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
	at org.scalatest.Suite.run(Suite.scala:1147)
	at org.scalatest.Suite.run$(Suite.scala:1129)
	at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
	at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
	at org.scalatest.FunSuiteLike$$Lambda$352.0000000019149C20.apply(Unknown Source)
	at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
	at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
	at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
	at com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterAll$$super$run(ArroStreamSerializationTest.scala:32)
	at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
	at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
	at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
	at com.datorama.pluto.arrow.ArroStreamSerializationTest.run(ArroStreamSerializationTest.scala:32)
	at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
	at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1346)
	at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1340)
	at org.scalatest.tools.Runner$$$Lambda$164.0000000017957020.apply(Unknown Source)
	at scala.collection.immutable.List.foreach(List.scala:388)
	at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1340)
	at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:1031)
	at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:1010)
	at org.scalatest.tools.Runner$$$Lambda$78.000000001B0D5820.apply(Unknown Source)
	at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1506)
	at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
	at org.scalatest.tools.Runner$.run(Runner.scala:850)
	at org.scalatest.tools.Runner.run(Runner.scala)
	at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:133)
	at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:27)
{code}

I believe the root cause of this bug is [this line|https://github.com/apache/arrow/blob/apache-arrow-0.15.0/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L1237] in the handleSafe method:

{code:java}
final int startOffset = getStartOffset(index);
{code}

We encountered this bug in Dremio's HashJoinOperator, where a loop has two cases: in one case it appends to one vector, and in the other it appends to another. When there are 'holes' in this loop, it ends up calling copyFromSafe with an index that is several slots away from the last update. In most cases this goes well, but occasionally (quite rarely, but it happens) it misses the need to resize the values buffer.

Would you be willing to accept a pull request fixing this issue?
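
The failure mode described in this report can be modelled in a few lines of plain Java. This is only a toy sketch, not Arrow's implementation: the class StaleOffsetSketch and all of its methods are hypothetical names invented for illustration, the offsets buffer is modelled as a zero-filled int array, and needsResizeFromLastSet is just one possible way to compute the real write position, not necessarily the fix Arrow would adopt. The numbers mirror the test in the report (four 20-byte values, a 90-byte values buffer, a write at index 14).

```java
/** Toy model of a variable-width vector's offsets; NOT Arrow's implementation. */
final class StaleOffsetSketch {
    // offsets[i] is where value i starts; offsets[i + 1] is where it ends.
    final int[] offsets;
    final int valueCapacity; // capacity of the values buffer, in bytes
    int lastSet = -1;        // last index that was actually written

    StaleOffsetSketch(int slots, int valueCapacity) {
        this.offsets = new int[slots + 1]; // zero-filled, like a fresh buffer
        this.valueCapacity = valueCapacity;
    }

    /** Propagates the last known end offset across the unwritten slots before index. */
    void fillHoles(int index) {
        for (int i = lastSet + 1; i < index; i++) {
            offsets[i + 1] = offsets[i];
        }
        lastSet = Math.max(lastSet, index - 1);
    }

    /** Records a value of the given length at index, filling holes first. */
    void set(int index, int length) {
        fillHoles(index);
        offsets[index + 1] = offsets[index] + length;
        lastSet = index;
    }

    /** Capacity check that trusts offsets[index], which is stale if holes are unfilled. */
    boolean needsResizeStale(int index, int length) {
        return offsets[index] + length > valueCapacity;
    }

    /** Capacity check that starts from the end of the last written value instead. */
    boolean needsResizeFromLastSet(int index, int length) {
        int writeStart = index > lastSet ? offsets[lastSet + 1] : offsets[index];
        return writeStart + length > valueCapacity;
    }

    public static void main(String[] args) {
        StaleOffsetSketch v = new StaleOffsetSketch(16, 90);
        for (int i = 0; i < 4; i++) {
            v.set(i, 20); // 80 bytes used after the loop
        }
        System.out.println(v.needsResizeStale(14, 20));       // prints false
        System.out.println(v.needsResizeFromLastSet(14, 20)); // prints true
    }
}
```

In this model the stale check sees offset 0 at index 14 and concludes that 20 bytes fit into 90, while the actual write would start at byte 80 and need 100 bytes, which matches the "index: 80, length: 20 (expected: range(0, 90))" message in the exception.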