[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps

2021-02-04 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279348#comment-17279348
 ] 

Paul Taylor commented on ARROW-11347:
-

[~domoritz] see my comment here: 
https://issues.apache.org/jira/browse/ARROW-11351?focusedCommentId=17279344=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17279344

tl;dr: the Row API doesn't use JS's Map; the abstract Row base class just 
implements the Map interface. The actual lookup is delegated to its concrete 
subclasses, StructRow and MapRow. StructRow still uses the flyweight, and 
MapRow attempts a similar optimization via Proxies where they are available.
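
To make the delegation described above concrete, here is a minimal sketch. It 
is not the Arrow JS source: the class names RowSketch, StructRowSketch, and 
MapRowSketch, and the methods fieldNames(), lookup(), and bind(), are 
hypothetical and exist only for illustration. It assumes a base class that 
exposes a Map-like surface, a struct row that is reused as a flyweight, and a 
map row that routes property access through a Proxy when one is available.

```javascript
// Hypothetical sketch: none of these names are the Arrow JS classes.

// Abstract base: offers a Map-like interface without storing anything in a
// built-in Map. Subclasses decide how a key is resolved.
class RowSketch {
  get size() { return this.fieldNames().length; }
  has(key) { return this.fieldNames().includes(key); }
  get(key) { return this.has(key) ? this.lookup(key) : undefined; }
  keys() { return this.fieldNames()[Symbol.iterator](); }
  *[Symbol.iterator]() {
    for (const key of this.fieldNames()) yield [key, this.lookup(key)];
  }
  // Subclasses supply these two methods.
  fieldNames() { throw new Error('not implemented'); }
  lookup(_key) { throw new Error('not implemented'); }
}

// Struct-like rows: the schema is fixed, so a single instance can be
// re-pointed at another row index (a flyweight) instead of allocating
// one object per row.
class StructRowSketch extends RowSketch {
  constructor(columns) {
    super();
    this.columns = columns; // { fieldName: array of values }
    this.names = Object.keys(columns);
    this.rowIndex = 0;
  }
  bind(rowIndex) { this.rowIndex = rowIndex; return this; }
  fieldNames() { return this.names; }
  lookup(key) { return this.columns[key][this.rowIndex]; }
}

// Map-like rows: keys vary per row, so property access is routed through a
// Proxy when the runtime provides one. Keys that collide with class members
// (e.g. "size") resolve to the member in this sketch.
class MapRowSketch extends RowSketch {
  constructor(keys, values) {
    super();
    this.rowKeys = keys;
    this.rowValues = values;
    if (typeof Proxy === 'undefined') return this;
    return new Proxy(this, {
      get(target, prop, receiver) {
        const isRowKey = typeof prop === 'string' &&
          !(prop in target) && target.rowKeys.includes(prop);
        return isRowKey ? target.lookup(prop)
                        : Reflect.get(target, prop, receiver);
      },
    });
  }
  fieldNames() { return this.rowKeys; }
  lookup(key) { return this.rowValues[this.rowKeys.indexOf(key)]; }
}
```

With these definitions, `new StructRowSketch({ foo: [1, 2] }).bind(1).get('foo')` 
reads the second row's value, and a `MapRowSketch` built from keys `['x', 'y']` 
can be read either with `row.get('x')` or with `row.y`.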

> [JavaScript] Consider Objects instead of Maps
> -
>
> Key: ARROW-11347
> URL: https://issues.apache.org/jira/browse/ARROW-11347
> Project: Apache Arrow
> Issue Type: Improvement
> Components: JavaScript
> Reporter: Dominik Moritz
> Priority: Major
> Labels: performance
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> A quick experiment 
> (https://observablehq.com/@domoritz/performance-of-maps-vs-objects) seems to 
> show that object accesses are a lot faster than map accesses. Would it make 
> sense to switch to objects in the row API to improve performance? 
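
A comparison of this kind can be sketched as follows. This is not the code 
from the linked notebook: the function names buildFixtures, timeIt, and 
compareAccess, the field count, and the iteration count are all assumptions 
made for illustration. Micro-benchmarks like this are sensitive to the engine 
and to JIT warm-up, so the numbers are only a rough indication.

```javascript
// Hypothetical micro-benchmark; names and sizes are illustrative only.

// Use a high-resolution clock when the runtime has one.
const now = typeof performance !== 'undefined'
  ? () => performance.now()
  : () => Date.now();

// Build an object and a Map that hold the same key/value pairs.
function buildFixtures(fieldCount) {
  const obj = {};
  const map = new Map();
  const keys = [];
  for (let i = 0; i < fieldCount; i++) {
    const key = `field${i}`;
    keys.push(key);
    obj[key] = i;
    map.set(key, i);
  }
  return { obj, map, keys };
}

// Run fn(i) for each iteration and sum the results so the engine cannot
// discard the lookups as dead code.
function timeIt(label, fn, iterations) {
  const start = now();
  let sum = 0;
  for (let i = 0; i < iterations; i++) sum += fn(i);
  return { label, ms: now() - start, sum };
}

// Time the same sequence of key lookups against the object and the Map.
function compareAccess(fieldCount = 16, iterations = 1e6) {
  const { obj, map, keys } = buildFixtures(fieldCount);
  return [
    timeIt('object', (i) => obj[keys[i % fieldCount]], iterations),
    timeIt('map', (i) => map.get(keys[i % fieldCount]), iterations),
  ];
}
```

Calling `compareAccess()` returns one timing record per container; the two 
`sum` fields should match, which confirms both loops did the same work.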



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps

2021-01-24 Thread Dominik Moritz (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270855#comment-17270855
 ] 

Dominik Moritz commented on ARROW-11347:


Oh awesome. Maybe this issue is not an issue then. We can close it when we hear 
back from [~paultay...@symularity.com]. 



[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps

2021-01-22 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270480#comment-17270480
 ] 

Brian Hulette commented on ARROW-11347:
---

Ah, you mean when accessing a Row, e.g. table.get(0).

I _think_ the choice of Map was for code reuse between Struct vectors and Map 
vectors ([~paul.e.taylor] wrote this, so he can comment with more certainty). 
Note that I also added the ability to access the fields in a row view "by 
attribute", in Python parlance, in https://github.com/apache/arrow/pull/2197. 
So if you have a table with a "foo" field, you can access it in a Row view 
with either table.get(0)["foo"] or table.get(0).foo. I'm pretty sure I added 
that in response to a perf measurement from Jeff back in 2018.
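
One way such "by attribute" access can be provided is sketched below. This is 
an assumption about the general technique, not the code from the pull request: 
makeRowClass and RowView are hypothetical names, and the sketch does not handle 
field names that collide with the class's own members (such as "get" or 
"columns").

```javascript
// Hypothetical sketch; makeRowClass is not an Arrow JS function.

// Build a row-view class for a fixed list of field names.
function makeRowClass(fieldNames) {
  class RowView {
    constructor(columns, rowIndex) {
      this.columns = columns;   // { fieldName: array of values }
      this.rowIndex = rowIndex; // which row this view points at
    }
    get(name) { return this.columns[name][this.rowIndex]; }
  }
  // Define one getter per field, once, on the prototype. Both row.foo and
  // row["foo"] then resolve to the same column lookup.
  for (const name of fieldNames) {
    Object.defineProperty(RowView.prototype, name, {
      get() { return this.columns[name][this.rowIndex]; },
      enumerable: true,
    });
  }
  return RowView;
}
```

For example, with `const Row = makeRowClass(['foo'])` and 
`const row = new Row({ foo: [1, 2] }, 1)`, the expressions `row.foo`, 
`row["foo"]`, and `row.get('foo')` all read the same value.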



[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps

2021-01-22 Thread Dominik Moritz (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270450#comment-17270450
 ] 

Dominik Moritz commented on ARROW-11347:


Yes, when accessing an element of an array (e.g. after `toArray()`). 

Before making the change, someone needs to look more closely at the 
performance benefits and usability. Jeff created his own parser for Arquero, 
which makes some simplifying assumptions and is less general, but is also 
almost twice as fast (https://github.com/uwdata/arquero-arrow/tree/main/perf). 
It would be good to figure out why. I think Maps are generally nicer than 
Objects for users, so maybe that is worth the performance difference. It would 
be great if you could share how you decided on Maps in the first place. 



[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps

2021-01-22 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270257#comment-17270257
 ] 

Brian Hulette commented on ARROW-11347:
---

Can you clarify where we use Maps that you think should change? Is it when 
accessing an element of a Map-typed array?

I'd be open to changing it, but we'd need to consider that this would be a 
breaking API change. I suppose this is technically OK since all releases are 
major, but it may be inconvenient for users.



[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps

2021-01-22 Thread Dominik Moritz (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270206#comment-17270206
 ] 

Dominik Moritz commented on ARROW-11347:


I wonder what [~bhulette] and [~paultaylor] would say about this, since they 
originally decided to go with Map. 



[jira] [Commented] (ARROW-11347) [JavaScript] Consider Objects instead of Maps

2021-01-22 Thread Neville Dipale (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270182#comment-17270182
 ] 

Neville Dipale commented on ARROW-11347:


Hi [~domoritz]

The performance difference looks solid. Do you think using Object would have a 
downside for the ergonomics of the APIs?

I haven't used the JS implementation enough to have an opinion, which is why 
I'm asking.

If you can open a PR with the change, we can review it and get it merged.

Thanks
