Hi:
  I encountered a 'java.lang.NegativeArraySizeException' error with CarbonData
1.3.1 + Spark 2.2.
  When I run the compact command to compact 8 level-1 segments into one level-2
segment, the following 'java.lang.NegativeArraySizeException' occurred:
*java.lang.NegativeArraySizeException
        at org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeVariableLengthDimesionDataChunkStore.getRow(UnsafeVariableLengthDimesionDataChunkStore.java:172)
        at org.apache.carbondata.core.datastore.chunk.impl.AbstractDimensionDataChunk.getChunkData(AbstractDimensionDataChunk.java:46)
        at org.apache.carbondata.core.scan.result.AbstractScannedResult.getNoDictionaryKeyArray(AbstractScannedResult.java:431)
        at org.apache.carbondata.core.scan.result.impl.NonFilterQueryScannedResult.getNoDictionaryKeyArray(NonFilterQueryScannedResult.java:67)
        at org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.scanResultAndGetData(RawBasedResultCollector.java:83)
        at org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.collectData(RawBasedResultCollector.java:58)
        at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:51)
        at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32)
        at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:49)
        at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
        at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31)
        at org.apache.carbondata.core.scan.result.iterator.RawResultIterator.hasNext(RawResultIterator.java:72)
        at org.apache.carbondata.processing.merger.RowResultMergerProcessor.execute(RowResultMergerProcessor.java:131)
        at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:228)
        at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:84)
        at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:109)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)*
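
For reference, the compaction is triggered through CarbonData's "ALTER TABLE ... COMPACT" statement; below is a minimal sketch of that kind of invocation via the Spark SQL API (the table name is only an example, and the snippet assumes the SparkSession is already set up for CarbonData):

import org.apache.spark.sql.SparkSession;

// Minimal sketch: trigger a minor compaction through Spark SQL.
// "my_table" is a placeholder; the session is assumed to be already
// configured for CarbonData.
public class TriggerCompaction {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("carbon-compaction")
        .getOrCreate();

    // Minor compaction merges loaded segments into level-1 (x.1) segments and,
    // depending on carbon.compaction.level.threshold, merges level-1 segments
    // into a level-2 (x.2) segment.
    spark.sql("ALTER TABLE my_table COMPACT 'MINOR'");
  }
}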

I traced the code of 'UnsafeVariableLengthDimesionDataChunkStore.getRow' and
found that the root cause is a negative length value when the byte array is
created: 'byte[] data = new byte[length];'. The values of the relevant
variables when the error occurred are below:

when 'rowId < numberOfRows - 1':
*this.dataLength=192000
currentDataOffset=2
rowId=0
OffsetOfNextdata=-12173  (why)
length=-12177*

otherwise (i.e., for the last row):

*this.dataLength=320000
currentDataOffset=263702
rowId=31999
length=-9238*

The value of (320000 - 263702) = 56298 exceeds the range of a signed short.
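
For what it's worth, both negative values match a 16-bit (short) overflow exactly. A small self-contained Java check (this is not the CarbonData code, just my back-calculation; 53363 is the unsigned value that would map to the observed -12173):

public class ShortOverflowCheck {
  public static void main(String[] args) {
    // Last-row case: dataLength - currentDataOffset does not fit in a signed short.
    int dataLength = 320000;
    int currentDataOffset = 263702;
    short length = (short) (dataLength - currentDataOffset);
    System.out.println(length);            // -9238, the observed length

    // Other case: a 2-byte offset of 53363 (assumed, back-calculated) read back
    // as a signed short comes out negative.
    short offsetOfNextData = (short) 53363;
    System.out.println(offsetOfNextData);  // -12173, the observed OffsetOfNextdata
  }
}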

I applied the patch from PR#2796 (https://github.com/apache/carbondata/pull/2796),
but the error still occurred.

Finally, my test steps were:

For example, with 4 level-1 compacted segments: 1.1, 2.1, 3.1, 4.1:
*1. run the compact command, it failed;
2. delete segment 1.1, run the compact command again, it failed;
3. delete segment 2.1, run the compact command again, it failed;
4. delete segment 3.1, run the compact command again, it succeeded;*

So I think one of the 8 level-1 compacted segments may have some problem,
but I don't know how to find out which one.


