Very sorry, I deleted an important part of the code. It should read:

// Number of bytes each element takes up Int is 4, etc.
int itemSize = new AsDtype(dtype).memory_itemsize();
int countBufferSize = (entryStop - entryStart + 1) * INT_SIZE;

ArrowBuf countsBuf = allocator.buffer(countBufferSize);
for (int x = 0; x < (entryStop - entryStart + 1); x++) {
    countsBuf.setInt(x * INT_SIZE, x * desc.getFixedLength());
}
// File format uses BE, so perform a byte swap to get to LE
ArrowBuf contentBuf = swapEndianness(contentTemp);

ArrowType outerType = new ArrowType.List();
// Convert from our internal dtype to the Arrow equivalent
ArrowType innerType = dtypeToArrow();

FieldType outerField = new FieldType(false, outerType, null);
FieldType innerField = new FieldType(false, innerType, null);

int outerLen = (entryStop - entryStart) * contentTemp.multiplicity();
int innerLen = contentTemp.numitems();
ArrowFieldNode outerNode = new ArrowFieldNode(outerLen, 0);
ArrowFieldNode innerNode = new ArrowFieldNode(innerLen, 0);

ListVector arrowVec = ListVector.empty("testcol", allocator);
arrowVec.loadFieldBuffers(outerNode, Arrays.asList(null, countsBuf));

AddOrGetResult<ValueVector> children = arrowVec.addOrGetVector(innerField);

FieldVector innerVec = (FieldVector) children.getVector();
innerVec.loadFieldBuffers(innerNode, Arrays.asList(null, contentBuf));

On Thu, Apr 27, 2023 at 2:20 PM Andrew Melo <andrew.m...@gmail.com> wrote:
>
> Hi all,
>
> I am working on a Spark datasource plugin that reads a (custom) file
> format and outputs arrow-backed columns. I'm having difficulty
> figuring out how to construct a ListVector if I have an ArrowBuf with
> the contents and know the width of each list. I've tried constructing
> the buffer with the code I pasted below, but it appears something
> becomes unaligned, and I get incorrect values back when reading the
> vector back.
>
> The documentation and elsewhere on the internet has examples where you
> construct the ListVector element-by-element (e.g. with
> UnionListWriter), but I'm having difficulty finding an example where
> you start from ArrowBufs and use that to directly construct the
> ListVector.
>
> Does anyone have an example they could point me to?
>
> Thanks!
> Andrew
>
> // Number of bytes each element takes up Int is 4, etc.
> int itemSize = new AsDtype(dtype).memory_itemsize();
> int countBufferSize = (entryStop - entryStart + 1) * INT_SIZE;
>
> ArrowBuf countsBuf = allocator.buffer(countBufferSize);
> // File format uses BE, so perform a byte swap to get to LE
> ArrowBuf contentBuf = swapEndianness(contentTemp);
>
> ArrowType outerType = new ArrowType.List();
> // Convert from our internal dtype to the Arrow equivalent
> ArrowType innerType = dtypeToArrow();
>
> FieldType outerField = new FieldType(false, outerType, null);
> FieldType innerField = new FieldType(false, innerType, null);
>
> int outerLen = (entryStop - entryStart) * contentTemp.multiplicity();
> int innerLen = contentTemp.numitems();
> ArrowFieldNode outerNode = new ArrowFieldNode(outerLen, 0);
> ArrowFieldNode innerNode = new ArrowFieldNode(innerLen, 0);
>
> ListVector arrowVec = ListVector.empty("testcol", allocator);
> arrowVec.loadFieldBuffers(outerNode, Arrays.asList(null, countsBuf));
>
> AddOrGetResult<ValueVector> children =
> arrowVec.addOrGetVector(innerField);
>
> FieldVector innerVec = (FieldVector) children.getVector();
> innerVec.loadFieldBuffers(innerNode, Arrays.asList(null, contentBuf));
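
For reference, here is a minimal sketch of the buffer layout that the Arrow columnar format describes for a list column whose lists all have the same width. It is not the plugin's code and it does not use the Arrow Java API: java.nio.ByteBuffer stands in for ArrowBuf, and the class and method names (ListOffsetsSketch, fixedWidthOffsets, reverseEachItem) are hypothetical. It assumes int32 offsets, of which there are (number of lists + 1), each counting child elements rather than bytes, stored little-endian. The second helper shows one way to turn big-endian items into little-endian ones by reversing the bytes of each item.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-alone sketch; ByteBuffer stands in for ArrowBuf.
class ListOffsetsSketch {

    // Builds the offsets buffer for listCount lists of itemsPerList elements.
    // Offsets are int32, little-endian, and there are listCount + 1 of them.
    // Each offset is an element index into the child values, not a byte position.
    static ByteBuffer fixedWidthOffsets(int listCount, int itemsPerList) {
        ByteBuffer offsets = ByteBuffer
                .allocate((listCount + 1) * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i <= listCount; i++) {
            offsets.putInt(i * Integer.BYTES, i * itemsPerList);
        }
        return offsets;
    }

    // Reverses the bytes of every itemSize-byte item, which converts
    // big-endian fixed-width values to little-endian (and back).
    static byte[] reverseEachItem(byte[] src, int itemSize) {
        if (itemSize <= 0 || src.length % itemSize != 0) {
            throw new IllegalArgumentException("length must be a multiple of itemSize");
        }
        byte[] dst = new byte[src.length];
        for (int base = 0; base < src.length; base += itemSize) {
            for (int b = 0; b < itemSize; b++) {
                dst[base + b] = src[base + itemSize - 1 - b];
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int listCount = 3;     // number of lists (the outer length)
        int itemsPerList = 2;  // fixed width of each list
        ByteBuffer offsets = fixedWidthOffsets(listCount, itemsPerList);

        // Print each offset; the last one equals the child element count.
        for (int i = 0; i <= listCount; i++) {
            System.out.println("offset[" + i + "] = " + offsets.getInt(i * Integer.BYTES));
        }
        System.out.println("child element count = " + listCount * itemsPerList);
    }
}
```

In this layout the outer length is the number of lists, and the child length is the number of lists times the list width. Those two counts and the offset values are worth checking against each other when values read back from the vector look shifted.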