[ https://issues.apache.org/jira/browse/ARROW-15852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515537#comment-17515537 ]
Paul Taylor commented on ARROW-15852: ------------------------------------- [~timhigins] Thanks for the report. In running your code, I discovered an oversight we made in the v7.0 refactor. That said, I think your {{indexOf()}} call is incorrect – {{indexOf()}} is the inverse of {{get()}} such that this should assert true: {{table.indexOf(table.get(0)) === 0}} In your case (looking up the index of a row), you want to pass the entire row contents to the {{table.indexOf()}} call like this: {code:javascript} const { tableFromArrays } = require('apache-arrow'); const t = tableFromArrays({ a: [0, 1, 2], b: ["foo", "bar", "baz"] }); console.log(t.getByteLength(0)); console.log(t.getByteLength(1)); console.log(t.indexOf({a: 0, b: "foo"})); console.log(t.indexOf({a: 1, b: "bar"})); console.log(t.indexOf({a: 2, b: "baz"})); {code} > [JS] Table getByteLength and indexOf don't work > ----------------------------------------------- > > Key: ARROW-15852 > URL: https://issues.apache.org/jira/browse/ARROW-15852 > Project: Apache Arrow > Issue Type: Bug > Components: JavaScript > Affects Versions: 7.0.0 > Reporter: Timothy Higinbottom > Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The functions table.getByteLength() and table.indexOf() don't return the > correct values. > They are bound dynamically to the Table class, in a way I don't fully > understand, with the following code: > [https://github.com/apache/arrow/blob/1b796ec3f9caeb5e86e3348ba940bef8d95915c5/js/src/table.ts#L378-L390] > The other functions like that, get(), set(), and isValid() all seem to work. > However, getByteLength() and indexOf() return the placeholder/sentinel values > of 0 and -1 respectively that are defined in the no-op code here: > [https://github.com/apache/arrow/blob/1b796ec3f9caeb5e86e3348ba940bef8d95915c5/js/src/table.ts#L207-L221,] > which I assume is to generate the right type definitions, and thus > documentation. > It's fairly simple for a user to implement the right logic themselves (at > least for getByteLength) and it's a quick patch to define the functions > normally instead of on the prototype, e.g.: > > {code:java} > /** > * Get the size in bytes of an element by index. > * @param index The index at which to get the byteLength. > */ > // @ts-ignore > public getByteLength(index: number): number { return > this.data[index].byteLength; } > /** > * Get the size in bytes of a table. > */ > //@ts-ignore > public getByteLength(): number { > return this.data.map((batch) => batch.byteLength).reduce((sum, > newLength) => sum + newLength); > } {code} > I'd be happy to send this as a PR if that's an OK alternative to the way it's > currently implemented. > Here's a Github repo of a minimal reproduction of the issue in NodeJS: > [https://github.com/alexkreidler/apache-arrow-js-small-bug] > > And an observable notebook for in the browser (although I couldn't get ESM > working): [https://observablehq.com/@08027ecfa2b2f7bb/arrow-7-canary] > > Thanks to all for your work on Arrow! > -- This message was sent by Atlassian Jira (v8.20.1#820001)