[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps
[ https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279348#comment-17279348 ]

Paul Taylor commented on ARROW-11347:
-------------------------------------

[~domoritz] see my comment here: https://issues.apache.org/jira/browse/ARROW-11351?focusedCommentId=17279344=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17279344

tl;dr: the Row API doesn't use JS's Map; the abstract Row base class just implements the Map interface. The actual lookup is delegated to its concrete subclass implementations, StructRow and MapRow. StructRow still uses the flyweight, and MapRow attempts a similar optimization via Proxies if available.

> [JavaScript] Consider Objects instead of Maps
> ---------------------------------------------
>
>                 Key: ARROW-11347
>                 URL: https://issues.apache.org/jira/browse/ARROW-11347
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: JavaScript
>            Reporter: Dominik Moritz
>            Priority: Major
>              Labels: performance
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> A quick experiment
> (https://observablehq.com/@domoritz/performance-of-maps-vs-objects) seems to
> show that object accesses are a lot faster than map accesses. Would it make
> sense to switch to objects in the row API to improve performance?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
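The design described in this comment can be illustrated roughly as follows. This is a minimal sketch, not the Arrow source: `RowSketch`, `StructRowSketch`, `bind`, and `withPropertyAccess` are hypothetical names, and the column layout is assumed to be a plain object of arrays. It shows an abstract base class exposing a Map-like surface, a flyweight subclass that is re-pointed at different rows, and an optional Proxy wrapper for property-style access.

```typescript
// Illustrative sketch only; none of these names come from the Arrow source.
type Columns = { [name: string]: unknown[] };

abstract class RowSketch {
  // Subclasses supply the actual lookup strategy.
  abstract keys(): string[];
  abstract has(key: string): boolean;
  protected abstract lookup(key: string): unknown;

  // Map-like surface shared by every row type.
  get(key: string): unknown {
    return this.has(key) ? this.lookup(key) : undefined;
  }
  get size(): number {
    return this.keys().length;
  }
  entries(): [string, unknown][] {
    return this.keys().map((k): [string, unknown] => [k, this.lookup(k)]);
  }
}

// Flyweight over columnar data: one instance is re-pointed at different rows
// instead of allocating a new object per row.
class StructRowSketch extends RowSketch {
  private rowIndex = 0;
  constructor(private readonly columns: Columns) {
    super();
  }
  bind(rowIndex: number): this {
    this.rowIndex = rowIndex;
    return this;
  }
  keys(): string[] {
    return Object.keys(this.columns);
  }
  has(key: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.columns, key);
  }
  protected lookup(key: string): unknown {
    return this.columns[key][this.rowIndex];
  }
}

// Optional Proxy layer: `row.foo` forwards to `row.get("foo")` when `foo` is a key.
// Falls back to the unwrapped row when Proxy is unavailable.
function withPropertyAccess<T extends RowSketch>(row: T): T & Record<string, unknown> {
  if (typeof Proxy === "undefined") {
    return row as T & Record<string, unknown>;
  }
  return new Proxy(row, {
    get(target, prop, receiver) {
      if (typeof prop === "string" && target.has(prop)) {
        return target.get(prop);
      }
      return Reflect.get(target, prop, receiver);
    },
  }) as T & Record<string, unknown>;
}

const row = new StructRowSketch({ x: [1, 2, 3], y: ["a", "b", "c"] });
const proxied = withPropertyAccess(row);
```

In this sketch a column named like a method (for example `get`) would shadow that method on the proxied row, which is one of the trade-offs a real implementation has to settle.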
[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps
[ https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270855#comment-17270855 ]

Dominik Moritz commented on ARROW-11347:
----------------------------------------

Oh awesome. Maybe this issue is not an issue then. We can close it when we hear back from [~paultay...@symularity.com].
[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps
[ https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270480#comment-17270480 ]

Brian Hulette commented on ARROW-11347:
---------------------------------------

Ah, you mean when accessing a Row, e.g. table.get(0).

I _think_ the choice of Map was for code reuse between Struct vectors and Map vectors ([~paul.e.taylor] wrote this, so he could comment with more certainty).

Note that I also added the ability to access the fields in a row view "by attribute" (in Python parlance) in https://github.com/apache/arrow/pull/2197. So if you have a table with a "foo" field, you can access it in a Row view with either table.get(0)["foo"] or table.get(0).foo. I'm pretty sure I added that in response to a perf measurement from Jeff back in 2018.
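One way "by attribute" access can be provided is by defining a getter per field name, so that `view["foo"]` and `view.foo` read the same cell. The following is only a sketch under that assumption and is not taken from the pull request above; `makeRowView` and `FieldColumns` are hypothetical names, and the columns are assumed to be plain arrays.

```typescript
// Hypothetical helper, not taken from the Arrow source or the PR mentioned above.
type FieldColumns = { [name: string]: unknown[] };

function makeRowView(columns: FieldColumns, rowIndex: number): { [name: string]: unknown } {
  const view: { [name: string]: unknown } = {};
  for (const name of Object.keys(columns)) {
    // Each field becomes a getter that reads straight from its column,
    // so no row object with copied values is ever materialized.
    Object.defineProperty(view, name, {
      enumerable: true,
      get: () => columns[name][rowIndex],
    });
  }
  return view;
}

const view = makeRowView({ foo: [10, 20], bar: ["a", "b"] }, 1);
console.log(view["foo"], view.foo); // both expressions read the same cell
```

Because the getters are ordinary properties, engines can treat `view.foo` like any other property read, which is plausibly why this style tends to measure well compared with a method call per field.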
[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps
[ https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270450#comment-17270450 ]

Dominik Moritz commented on ARROW-11347:
----------------------------------------

Yes, when accessing an element of an array (e.g. after `toArray()`).

Before making the change, someone needs to look more closely at the performance benefits and usability. Jeff created his own parser for Arquero, which makes some simplifying assumptions and is less general, but is also almost twice as fast (https://github.com/uwdata/arquero-arrow/tree/main/perf). It would be good to figure out why.

I think Maps are generally nicer than Objects for users, so maybe they are worth the performance difference. It would be great if you could share how you decided on Maps in the first place.
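For anyone who wants to look into the performance question, a rough micro-benchmark along the following lines could be a starting point. It is a sketch, not the linked Observable notebook: `timeIt`, the four field names, and the iteration count are arbitrary assumptions, and the timings depend heavily on the JavaScript engine, key shapes, and warm-up, so the numbers should be read with caution.

```typescript
// Rough micro-benchmark sketch; results vary by engine and data shape.
function timeIt(label: string, fn: () => number): { label: string; ms: number; checksum: number } {
  const start = Date.now();
  const checksum = fn(); // returning a value keeps the loop from being optimized away
  return { label, ms: Date.now() - start, checksum };
}

const fieldNames = ["a", "b", "c", "d"];
const asObject: { [key: string]: number } = { a: 1, b: 2, c: 3, d: 4 };
const asMap = new Map<string, number>(
  fieldNames.map((k, i): [string, number] => [k, i + 1])
);
const iterations = 1000000;

const objectResult = timeIt("object", () => {
  let sum = 0;
  for (let i = 0; i < iterations; i++) {
    sum += asObject[fieldNames[i & 3]];
  }
  return sum;
});

const mapResult = timeIt("map", () => {
  let sum = 0;
  for (let i = 0; i < iterations; i++) {
    sum += asMap.get(fieldNames[i & 3])!;
  }
  return sum;
});

console.log(objectResult, mapResult);
```

Matching checksums confirm that both loops did the same work; only the `ms` fields are meant to be compared.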
[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps
[ https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270257#comment-17270257 ]

Brian Hulette commented on ARROW-11347:
---------------------------------------

Can you clarify where we use Maps that you think should change? Is it when accessing an element of a Map-typed array?

I'd be open to changing it, but we'd need to consider that this would be a breaking API change. I suppose this is technically OK since all releases are major, but it may be inconvenient for users.
[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps
[ https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270206#comment-17270206 ]

Dominik Moritz commented on ARROW-11347:
----------------------------------------

I wonder what [~bhulette] and [~paultaylor] say about this since they originally decided to go with Map.
[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps
[ https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270182#comment-17270182 ]

Neville Dipale commented on ARROW-11347:
----------------------------------------

Hi [~domoritz],

The performance difference looks solid. Do you think there would be a downside to using Object in terms of the ergonomics of the APIs? I haven't used the JS implementation enough to have an opinion, hence my question.

If you can open a PR with the change, we can review it and get it merged.

Thanks