hnasrullakhan commented on issue #44410:
URL: https://github.com/apache/arrow/issues/44410#issuecomment-2414649373

   
   
   
   @vibhatha  Thanks
   
   ```
   List<FieldVector> fieldVectors = vectorSchemaRoot.getFieldVectors();
   for (FieldVector fieldVector : fieldVectors) {
       // System.out.println("fieldVector value column: " + fieldVector.getValueCount());
       int fieldVectorValueCount = fieldVector.getValueCount();
       System.out.println("Before");
       System.out.println(fieldVector);
       Field field = fieldVector.getField();
       final ArrowBuf dataBuffer = fieldVector.getDataBuffer();
       final ArrowBuf validityBuffer = fieldVector.getValidityBuffer();
       final ArrowBuf offsetBuffer = fieldVector.getOffsetBuffer();

       if (field.getType().getTypeID()
               == org.apache.gluten.shaded.org.apache.arrow.vector.types.pojo.ArrowType.Utf8.TYPE_TYPE) {
           // System.out.println("This is string column id" + column);

           // Compress each buffer associated with the string column
           int maxCompressedLength = compressor.maxCompressedLength((int) dataBuffer.capacity());
           byte[] compressedBytes = new byte[maxCompressedLength];
           ByteBuffer originalBuffer = dataBuffer.nioBuffer(0, (int) dataBuffer.readableBytes());
           byte[] originalBytes = new byte[originalBuffer.remaining()];
           System.out.println("originalbytes: " + dataBuffer.readableBytes());

           int compressedLength = compressor.compress(
                   originalBytes, 0, originalBytes.length, compressedBytes, 0, maxCompressedLength);

           ArrowBuf compressedBuf = allocator.buffer(compressedLength);
           compressedBuf.writeBytes(compressedBytes, 0, compressedLength);
           compressedBuf.writerIndex(dataBuffer.writerIndex());
           compressedBuf.readerIndex(0);
           ArrowFieldNode fieldNode =
                   new ArrowFieldNode(fieldVector.getValueCount(), fieldVector.getNullCount());
           fieldVector.loadFieldBuffers(fieldNode, List.of(validityBuffer, offsetBuffer, compressedBuf));
           final ArrowBuf newDataBuffer = fieldVector.getDataBuffer();
           vectorSchemaRoot.setRowCount(rowCount);
           fieldVector.setValueCount(fieldVectorValueCount);
           long newDataBufferBytes = newDataBuffer.readableBytes();
           System.out.println();
           System.out.println("After");
           System.out.println("compressedBytes: " + newDataBufferBytes);
           System.out.println(fieldVector);
       }
     ```
   
   I tried this, but `loadFieldBuffers` doesn't seem to load the new data:
   
   ```
   Before
   [CMT, CMT, VTS, CMT, VTS, CMT, VTS, CMT, VTS, CMT, ... CMT, CMT, CMT, CMT, CMT, CMT, CMT, CMT, CMT, CMT]
   originalbytes: 12288
   
   After
   compressedBytes: 12288
   ```
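
Two details in the snippet might explain why both sizes read 12288. First, `originalBytes` is allocated but never filled from `originalBuffer`, so the compressor appears to receive an all-zero array. Second, `compressedBuf.writerIndex(dataBuffer.writerIndex())` moves the writer index back to the original position, and `readableBytes()` is the writer index minus the reader index, so it would report the original size regardless of how many bytes were written. The sketch below shows the copy-then-compress step on its own. It is only an illustration under assumptions: it uses `java.util.zip.Deflater` as a stand-in for the `compressor` object (whose type is not shown above), and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Arrow: copy bytes out of a ByteBuffer, then compress them.
final class BufferCompressionSketch {

    // Copy the remaining bytes of the buffer into a new array.
    // This is the step the original snippet skips (the array is allocated but never filled).
    static byte[] copyOut(ByteBuffer source) {
        ByteBuffer view = source.duplicate(); // leave the caller's position untouched
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    // Compress with Deflater (a stand-in for the unknown `compressor`).
    // The returned array has exactly the compressed length.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] scratch = new byte[input.length + 64];
        int length = 0;
        while (!deflater.finished()) {
            if (length == scratch.length) {
                scratch = Arrays.copyOf(scratch, scratch.length * 2);
            }
            length += deflater.deflate(scratch, length, scratch.length - length);
        }
        deflater.end();
        return Arrays.copyOf(scratch, length);
    }

    // Inverse of compress; the caller must remember the uncompressed length.
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[originalLength];
        int filled = 0;
        try {
            while (filled < originalLength && !inflater.finished()) {
                int read = inflater.inflate(out, filled, originalLength - filled);
                if (read == 0) {
                    break; // no progress: input exhausted or malformed
                }
                filled += read;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out;
    }

    public static void main(String[] args) {
        // Build a repetitive sample similar in spirit to the "CMT"/"VTS" column.
        StringBuilder sample = new StringBuilder();
        for (int i = 0; i < 4096; i++) {
            sample.append(i % 2 == 0 ? "CMT" : "VTS");
        }
        byte[] original = copyOut(ByteBuffer.wrap(sample.toString().getBytes(StandardCharsets.US_ASCII)));
        byte[] compressed = compress(original);
        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("round trip ok:    " + Arrays.equals(original, decompress(compressed, original.length)));
    }
}
```

In the Arrow snippet, the equivalent changes would presumably be `originalBuffer.get(originalBytes)` before compressing, and leaving the writer index of `compressedBuf` at `compressedLength` (which `writeBytes` already does) rather than resetting it to `dataBuffer.writerIndex()`.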
   
   The last `System.out.println(fieldVector);` fails with the error below:
   
   ```
   java.lang.IndexOutOfBoundsException: index: 12258, length: 3 (expected: range(0, 64))
        at org.apache.gluten.shaded.org.apache.arrow.memory.ArrowBuf.checkIndex(ArrowBuf.java:701)
        at org.apache.gluten.shaded.org.apache.arrow.memory.ArrowBuf.getBytes(ArrowBuf.java:728)
        at org.apache.gluten.shaded.org.apache.arrow.vector.util.ReusableByteArray.set(ReusableByteArray.java:63)
        at org.apache.gluten.shaded.org.apache.arrow.vector.VarCharVector.read(VarCharVector.java:142)
        at org.apache.gluten.shaded.org.apache.arrow.vector.VarCharVector.getObject(VarCharVector.java:128)
        at org.apache.gluten.shaded.org.apache.arrow.vector.VarCharVector.getObject(VarCharVector.java:40)
        at org.apache.gluten.shaded.org.apache.arrow.vector.util.ValueVectorUtility.lambda$getToString$0(ValueVectorUtility.java:58)
        at org.apache.gluten.shaded.org.apache.arrow.vector.util.ValueVectorUtility.getToString(ValueVectorUtility.java:95)
        at org.apache.gluten.shaded.org.apache.arrow.vector.util.ValueVectorUtility.getToString(ValueVectorUtility.java:58)
        at org.apache.gluten.shaded.org.apache.arrow.vector.BaseValueVector.toString(BaseValueVector.java:68)
        at java.base/java.lang.String.valueOf(String.java:2951)
        at java.base/java.io.PrintStream.println(PrintStream.java:897)
        at org.apache.spark.microsoft.tools.logging.ConsoleLogPrintStream.println(ConsoleLogPrintStream.scala:141)
   ```
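
Reading the trace, `range(0, 64)` suggests the new data buffer was in fact installed (its capacity is 64 bytes), so `loadFieldBuffers` probably did its job. The failure looks like a layout mismatch instead: a variable-width vector reads value `i` from the byte range `offsets[i]` to `offsets[i + 1]` in the data buffer, and the unchanged offset buffer still points at positions in the uncompressed data (here, 3 bytes at index 12258). The following is a plain-Java model of that lookup, not Arrow code; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical model of a variable-width column: one data array plus an offsets array.
final class OffsetLayoutSketch {

    // Value i occupies data[offsets[i] .. offsets[i + 1]).
    static String read(byte[] data, int[] offsets, int i) {
        int start = offsets[i];
        int length = offsets[i + 1] - start;
        if (start < 0 || length < 0 || start + length > data.length) {
            throw new IndexOutOfBoundsException(
                    "index: " + start + ", length: " + length
                            + " (expected: range(0, " + data.length + "))");
        }
        return new String(data, start, length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = "CMTVTSCMT".getBytes(StandardCharsets.US_ASCII);
        int[] offsets = {0, 3, 6, 9};

        // With the matching data array, every value can be read.
        for (int i = 0; i < offsets.length - 1; i++) {
            System.out.println(read(data, offsets, i));
        }

        // Swap in a shorter array (standing in for a compressed buffer) but keep the offsets.
        byte[] shorter = new byte[4];
        try {
            read(shorter, offsets, 2);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("failed as expected: " + e.getMessage());
        }
    }
}
```

If this reading is right, a vector whose data buffer holds compressed bytes cannot be printed or read through `getObject` until the buffer is decompressed again, because the offsets only make sense against the uncompressed bytes.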
   
   
                 
   
   

