Eyal Farago created ARROW-7837:
----------------------------------

             Summary: bug in BaseVariableWidthVector.copyFromSafe results with 
an index out of bounds exception
                 Key: ARROW-7837
                 URL: https://issues.apache.org/jira/browse/ARROW-7837
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Java
    Affects Versions: 0.15.0
            Reporter: Eyal Farago


There's a subtle bug in the copySafe method of BaseVariableWidthVector that 
results with an index out of bounds exception.

The issue is somewhere between the safeCopy and handleSafe methods,

copySafe calls handleSafe in order to assure underlying buffers capacity before 
appending a value to the vector, however the handleSafe method falsely assumes 
all 'holes' have been field when checking the next write offset. as a result it 
reads a stale offset (I believe it's 0 for freshly allocated buffers but may be 
un-guaranteed when reusing a buffer) and fails to identify the need to resize 
the values buffer.

 

the following (scala) test demonstrates the issue (by artificially shrinking 
the values buffer). it was written after we've hit this in production:
{code:java}
test("try to reproduce Arrow issue"){
    val charVector = new VarCharVector("stam", Allocator.get)

    val srcCharVector = new VarCharVector("src", Allocator.get)
    srcCharVector.setSafe(0, Array.tabulate(20)(_.toByte))
    srcCharVector.setValueCount(2)

    for( i <- 0 until 4){
      charVector.copyFromSafe(0, i, srcCharVector)
      charVector.setValueCount(i + 1)
    }

    val valBuff = charVector.getBuffers(false)(2)
    valBuff.capacity(90)

    charVector.copyFromSafe(0, 14, srcCharVector)

    srcCharVector.close()
    charVector.close()
  }
{code}
 this test fails with the following exception:

 
{code:java}
index: 80, length: 20 (expected: range(0, 90))
java.lang.IndexOutOfBoundsException: index: 80, length: 20 (expected: range(0, 
90))
        at io.netty.buffer.ArrowBuf.getBytes(ArrowBuf.java:929)
        at 
org.apache.arrow.vector.BaseVariableWidthVector.copyFromSafe(BaseVariableWidthVector.java:1345)
        at 
com.datorama.pluto.arrow.ArroStreamSerializationTest.$anonfun$new$33(ArroStreamSerializationTest.scala:454)
        at 
com.datorama.pluto.arrow.ArroStreamSerializationTest$$Lambda$129.00000000F78CFE20.apply$mcV$sp(Unknown
 Source)
        at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
        at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
        at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
        at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
        at org.scalatest.Transformer.apply(Transformer.scala:22)
        at org.scalatest.Transformer.apply(Transformer.scala:20)
        at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
        at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
        at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
        at org.scalatest.FunSuite.withFixture(FunSuite.scala:1560)
        at 
org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
        at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
        at 
org.scalatest.FunSuiteLike$$Lambda$367.00000000001B9220.apply(Unknown Source)
        at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
        at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
        at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
        at 
com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterEachTestData$$super$runTest(ArroStreamSerializationTest.scala:32)
        at 
org.scalatest.BeforeAndAfterEachTestData.runTest(BeforeAndAfterEachTestData.scala:194)
        at 
org.scalatest.BeforeAndAfterEachTestData.runTest$(BeforeAndAfterEachTestData.scala:187)
        at 
com.datorama.pluto.arrow.ArroStreamSerializationTest.runTest(ArroStreamSerializationTest.scala:32)
        at 
org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
        at 
org.scalatest.FunSuiteLike$$Lambda$358.000000001AAC0020.apply(Unknown Source)
        at 
org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
        at org.scalatest.SuperEngine$$Lambda$359.000000001AAC0820.apply(Unknown 
Source)
        at scala.collection.immutable.List.foreach(List.scala:388)
        at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
        at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
        at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
        at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
        at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
        at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
        at org.scalatest.Suite.run(Suite.scala:1147)
        at org.scalatest.Suite.run$(Suite.scala:1129)
        at 
org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
        at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
        at 
org.scalatest.FunSuiteLike$$Lambda$352.0000000019149C20.apply(Unknown Source)
        at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
        at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
        at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
        at 
com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterAll$$super$run(ArroStreamSerializationTest.scala:32)
        at 
org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
        at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
        at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
        at 
com.datorama.pluto.arrow.ArroStreamSerializationTest.run(ArroStreamSerializationTest.scala:32)
        at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
        at 
org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1346)
        at 
org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1340)
        at 
org.scalatest.tools.Runner$$$Lambda$164.0000000017957020.apply(Unknown Source)
        at scala.collection.immutable.List.foreach(List.scala:388)
        at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1340)
        at 
org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:1031)
        at 
org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:1010)
        at 
org.scalatest.tools.Runner$$$Lambda$78.000000001B0D5820.apply(Unknown Source)
        at 
org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1506)
        at 
org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
        at org.scalatest.tools.Runner$.run(Runner.scala:850)
        at org.scalatest.tools.Runner.run(Runner.scala)
        at 
org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:133)
        at 
org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:27)
{code}

I believe the root cause for this bugs is in [this 
line|https://github.com/apache/arrow/blob/apache-arrow-0.15.0/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L1237]
 in the handleSafe method:

{code:java}
final int startOffset = getStartOffset(index);
{code}

we've encountered this bug in dremio's HashJoinOperator, where a loop has two 
cases: in one case it appends to one vector and in the other case it appends to 
another, when there are 'holes' in this loop it ends up calling copySafe with 
an index which is several slots away from the last update, in most cases this 
goes well but it occasionally (quite rare, but happens) misses the need to 
resize the values buffer.

will you be willing to accept a pull request fixing this issue?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to