hnasrullakhan commented on issue #44410:
URL: https://github.com/apache/arrow/issues/44410#issuecomment-2414649373
@vibhatha Thanks
```
List<FieldVector> fieldVectors = vectorSchemaRoot.getFieldVectors();
for (FieldVector fieldVector : fieldVectors) {
    // System.out.println("fieldVector value column: " + fieldVector.getValueCount());
    int fieldVectorValueCount = fieldVector.getValueCount();
    System.out.println("Before");
    System.out.println(fieldVector);
    Field field = fieldVector.getField();
    final ArrowBuf dataBuffer = fieldVector.getDataBuffer();
    final ArrowBuf validityBuffer = fieldVector.getValidityBuffer();
    final ArrowBuf offsetBuffer = fieldVector.getOffsetBuffer();
    if (field.getType().getTypeID()
            == org.apache.gluten.shaded.org.apache.arrow.vector.types.pojo.ArrowType.Utf8.TYPE_TYPE) {
        // System.out.println("This is string column id" + column);
        // Compress each buffer associated with the string column
        int maxCompressedLength = compressor.maxCompressedLength((int) dataBuffer.capacity());
        byte[] compressedBytes = new byte[maxCompressedLength];
        ByteBuffer originalBuffer = dataBuffer.nioBuffer(0, (int) dataBuffer.readableBytes());
        byte[] originalBytes = new byte[originalBuffer.remaining()];
        System.out.println("originalbytes: " + dataBuffer.readableBytes());
        int compressedLength = compressor.compress(
                originalBytes, 0, originalBytes.length, compressedBytes, 0, maxCompressedLength);
        ArrowBuf compressedBuf = allocator.buffer(compressedLength);
        compressedBuf.writeBytes(compressedBytes, 0, compressedLength);
        compressedBuf.writerIndex(dataBuffer.writerIndex());
        compressedBuf.readerIndex(0);
        ArrowFieldNode fieldNode =
                new ArrowFieldNode(fieldVector.getValueCount(), fieldVector.getNullCount());
        fieldVector.loadFieldBuffers(fieldNode, List.of(validityBuffer, offsetBuffer, compressedBuf));
        final ArrowBuf newDataBuffer = fieldVector.getDataBuffer();
        vectorSchemaRoot.setRowCount(rowCount);
        fieldVector.setValueCount(fieldVectorValueCount);
        long newDataBufferBytes = newDataBuffer.readableBytes();
        System.out.println();
        System.out.println("After");
        System.out.println("compressedBytes: " + newDataBufferBytes);
        System.out.println(fieldVector);
    }
}
```
I tried this, and `loadFieldBuffers` doesn't seem to load the new data. The reported size is the same before and after:
```
Before
[CMT, CMT, VTS, CMT, VTS, CMT, VTS, CMT, VTS, CMT, ... CMT, CMT, CMT, CMT, CMT, CMT, CMT, CMT, CMT, CMT]
originalbytes: 12288
After
compressedBytes: 12288
```
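One thing that may be worth checking in the snippet above: `originalBytes` is allocated from `originalBuffer.remaining()` but nothing is ever copied into it, so the compressor would be fed an all-zero array. The sketch below uses only the JDK to show the difference; `java.util.zip.Deflater` is a stand-in for the `compressor` in the snippet, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CopyBeforeCompress {

    // Copies the readable bytes of a ByteBuffer into a fresh array.
    static byte[] toArray(ByteBuffer source) {
        byte[] bytes = new byte[source.remaining()];
        // Without this get(...) call the array keeps its default zeros.
        source.duplicate().get(bytes);
        return bytes;
    }

    // Stand-in compressor (assumption: not the compressor used in the snippet).
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflates back to the known original length.
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[originalLength];
        int total = 0;
        try {
            while (total < originalLength && !inflater.finished()) {
                total += inflater.inflate(out, total, originalLength - total);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "CMTCMTVTSCMTVTS".getBytes(StandardCharsets.UTF_8);
        ByteBuffer originalBuffer = ByteBuffer.wrap(data);

        // As in the snippet: sized correctly, but never filled.
        byte[] notCopied = new byte[originalBuffer.remaining()];
        // With the explicit copy.
        byte[] copied = toArray(originalBuffer);

        System.out.println("notCopied is all zeros: "
                + Arrays.equals(notCopied, new byte[data.length]));
        System.out.println("copied matches data: " + Arrays.equals(copied, data));

        byte[] roundTrip = decompress(compress(copied), copied.length);
        System.out.println("round trip matches data: " + Arrays.equals(roundTrip, data));
    }
}
```

The matching byte counts in the output are probably a separate effect: the snippet sets `compressedBuf.writerIndex(dataBuffer.writerIndex())`, so `readableBytes()` would report the original writer index rather than `compressedLength`.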
The last `System.out.println(fieldVector);` fails with the error below:
```
java.lang.IndexOutOfBoundsException: index: 12258, length: 3 (expected: range(0, 64))
    at org.apache.gluten.shaded.org.apache.arrow.memory.ArrowBuf.checkIndex(ArrowBuf.java:701)
    at org.apache.gluten.shaded.org.apache.arrow.memory.ArrowBuf.getBytes(ArrowBuf.java:728)
    at org.apache.gluten.shaded.org.apache.arrow.vector.util.ReusableByteArray.set(ReusableByteArray.java:63)
    at org.apache.gluten.shaded.org.apache.arrow.vector.VarCharVector.read(VarCharVector.java:142)
    at org.apache.gluten.shaded.org.apache.arrow.vector.VarCharVector.getObject(VarCharVector.java:128)
    at org.apache.gluten.shaded.org.apache.arrow.vector.VarCharVector.getObject(VarCharVector.java:40)
    at org.apache.gluten.shaded.org.apache.arrow.vector.util.ValueVectorUtility.lambda$getToString$0(ValueVectorUtility.java:58)
    at org.apache.gluten.shaded.org.apache.arrow.vector.util.ValueVectorUtility.getToString(ValueVectorUtility.java:95)
    at org.apache.gluten.shaded.org.apache.arrow.vector.util.ValueVectorUtility.getToString(ValueVectorUtility.java:58)
    at org.apache.gluten.shaded.org.apache.arrow.vector.BaseValueVector.toString(BaseValueVector.java:68)
    at java.base/java.lang.String.valueOf(String.java:2951)
    at java.base/java.io.PrintStream.println(PrintStream.java:897)
    at org.apache.spark.microsoft.tools.logging.ConsoleLogPrintStream.println(ConsoleLogPrintStream.scala:141)
```
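The `range(0, 64)` in the message suggests the data buffer the vector now reads from is only 64 bytes long, while the offset buffer still describes the uncompressed layout (an offset of 12258 for a 3-byte value). A minimal JDK-only sketch of that mismatch follows; plain arrays stand in for the Arrow buffers, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

public class OffsetsVsData {

    // Reads value i from a variable-width layout:
    // the bytes [offsets[i], offsets[i + 1]) of the data array.
    static String read(int[] offsets, byte[] data, int i) {
        int start = offsets[i];
        int length = offsets[i + 1] - start;
        // Throws IndexOutOfBoundsException if the range does not fit in data.
        Objects.checkFromIndexSize(start, length, data.length);
        return new String(data, start, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] uncompressed = "CMTCMTVTS".getBytes(StandardCharsets.UTF_8);
        int[] offsets = {0, 3, 6, 9}; // three values of three bytes each

        // Offsets and data agree, so the read succeeds.
        System.out.println(read(offsets, uncompressed, 2));

        // Stand-in for a compressed payload: shorter than the offsets expect.
        byte[] shorter = new byte[4];
        try {
            read(offsets, shorter, 2);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out of bounds: " + e.getMessage());
        }
    }
}
```

If that reading is right, the vector can only be printed or read after the data buffer has been decompressed back to the layout the offsets describe.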