Denis Gursky created ARROW-18007:
------------------------------------

             Summary: [JS] Values returned as undefined when arrow file bigger than 2gb
                 Key: ARROW-18007
                 URL: https://issues.apache.org/jira/browse/ARROW-18007
             Project: Apache Arrow
          Issue Type: Bug
          Components: JavaScript
            Reporter: Denis Gursky

Steps:

1. Generate an Arrow file larger than 2 GB:
{code:java}
import pyarrow as pa

# Build two columns of 140 million values each
nums1 = [42]
nums2 = [42.42]
mil = 1000000
for n in range(1, 140 * mil):
    nums1.append(n)
    nums2.append(1 / n)

arr1 = pa.array(nums1)
arr2 = pa.array(nums2)
schema = pa.schema([
    pa.field('nums1', arr1.type),
    pa.field('nums2', arr2.type),
])

# Write everything as a single record batch in the Arrow IPC file format
with pa.OSFile('arraydata.arrow', 'wb') as sink:
    with pa.ipc.new_file(sink, schema=schema) as writer:
        batch = pa.record_batch([arr1, arr2], schema=schema)
        writer.write(batch)
{code}
2. Try to read it with the JS SDK:
{code:java}
const fs = require("fs");
const { tableFromIPC, RecordBatchReader } = require("apache-arrow");

const filePath = "./arraydata.arrow";
const stream = fs.createReadStream(filePath);
const reader = RecordBatchReader.from(stream);

(async function () {
  const table = await tableFromIPC(reader);
  console.log("numRows", table.numRows);
  console.log("first row", table.get(0).toArray());
})();
{code}
The code above prints:
{code:java}
numRows 140000000
first row [ undefined, undefined ]
{code}
{{numRows}} is correct, but both values in the first row come out as {{undefined}}.
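
As a sanity check on the "bigger than 2gb" condition, the size of the generated data can be estimated without running the generator. This is only a rough sketch: it assumes pyarrow infers int64 for {{nums1}} and float64 for {{nums2}} (8 bytes per value), ignores IPC metadata and padding, and does not establish what causes the {{undefined}} values.

```python
# Back-of-envelope size of the two columns written by the generator script.
# Assumption: pyarrow infers int64 for nums1 and float64 for nums2,
# so every value occupies 8 bytes. Metadata and padding are ignored.
ROWS = 140 * 1_000_000
BYTES_PER_VALUE = 8
COLUMNS = 2

column_bytes = ROWS * BYTES_PER_VALUE   # one column's data buffer
data_bytes = column_bytes * COLUMNS     # both columns together
INT32_MAX = 2**31 - 1                   # largest signed 32-bit value

print(f"one column:   {column_bytes:,} bytes")
print(f"both columns: {data_bytes:,} bytes")
print(f"int32 max:    {INT32_MAX:,}")
print("one column exceeds int32 range:  ", column_bytes > INT32_MAX)
print("both columns exceed int32 range: ", data_bytes > INT32_MAX)
```

Under these assumptions each column stays below the signed 32-bit limit on its own, and only the combined record batch crosses it.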