Hi,

I'm hitting a 'java.lang.NegativeArraySizeException' with CarbonData 1.3.1 + Spark 2.2. When I run the compact command to merge 8 level-1 segments into a level-2 segment, the task fails with:

java.lang.NegativeArraySizeException
    at org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeVariableLengthDimesionDataChunkStore.getRow(UnsafeVariableLengthDimesionDataChunkStore.java:172)
    at org.apache.carbondata.core.datastore.chunk.impl.AbstractDimensionDataChunk.getChunkData(AbstractDimensionDataChunk.java:46)
    at org.apache.carbondata.core.scan.result.AbstractScannedResult.getNoDictionaryKeyArray(AbstractScannedResult.java:431)
    at org.apache.carbondata.core.scan.result.impl.NonFilterQueryScannedResult.getNoDictionaryKeyArray(NonFilterQueryScannedResult.java:67)
    at org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.scanResultAndGetData(RawBasedResultCollector.java:83)
    at org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.collectData(RawBasedResultCollector.java:58)
    at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:51)
    at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:49)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31)
    at org.apache.carbondata.core.scan.result.iterator.RawResultIterator.hasNext(RawResultIterator.java:72)
    at org.apache.carbondata.processing.merger.RowResultMergerProcessor.execute(RowResultMergerProcessor.java:131)
    at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:228)
    at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:84)
    at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
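For context, this is the shape of the compaction statement I run (the table name is a placeholder for my real one; in my setup carbon.compaction.level.threshold in carbon.properties controls how many level-1 segments get merged into a level-2 segment):

    ALTER TABLE my_table COMPACT 'MINOR'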
I traced the code of 'UnsafeVariableLengthDimesionDataChunkStore.getRow' and found that the root cause is a negative length when the byte array is created: 'byte[] data = new byte[length];'. The parameter values at the time of the error were:

when 'rowId < numberOfRows - 1':

    this.dataLength=192000 currentDataOffset=2 rowId=0 OffsetOfNextdata=-12173 (why is it negative?) length=-12177

otherwise (the last row):

    this.dataLength=320000 currentDataOffset=263702 rowId=31999 length=-9238

Note that (320000 - 263702) = 56298, which exceeds the range of short, so casting it to short wraps around to a negative value; a minimal sketch demonstrating this is at the end of this mail.

I applied the patch from PR#2796 (https://github.com/apache/carbondata/pull/2796), but the error still occurred.

Finally, my test steps were as follows. For example, with 4 level-1 compacted segments 1.1, 2.1, 3.1 and 4.1 (the delete command I used is sketched below):

1. run the compact command: it failed;
2. delete segment 1.1 and run the compact command again: it failed;
3. delete segment 2.1 and run the compact command again: it failed;
4. delete segment 3.1 and run the compact command again: it succeeded.

So I think one of the 8 level-1 compacted segments may have some problem, but I don't know how to find out which one.
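For reference, the segment deletions in the steps above were done with CarbonData's delete-by-segment-id statement, along these lines (again the table name is a placeholder):

    DELETE FROM TABLE my_table WHERE SEGMENT.ID IN (1.1)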
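And here is the minimal standalone sketch of the short overflow mentioned above. It is not CarbonData code, just the same arithmetic with the values from my error dump:

    public class ShortOverflowDemo {
      public static void main(String[] args) {
        int dataLength = 320000;         // this.dataLength from the dump
        int currentDataOffset = 263702;  // currentDataOffset from the dump
        // 320000 - 263702 = 56298, which is larger than Short.MAX_VALUE (32767),
        // so the cast to short wraps around to a negative value
        short length = (short) (dataLength - currentDataOffset);
        System.out.println(length);      // prints -9238, the same length as in the dump
        byte[] data = new byte[length];  // throws java.lang.NegativeArraySizeException
      }
    }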