[ https://issues.apache.org/jira/browse/ARROW-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091547#comment-17091547 ]
Neville Dipale commented on ARROW-5949:
---------------------------------------

Hi [~vertexclique], there was some discussion around using sentinel values instead of a bitmask ([https://github.com/apache/arrow/pull/6095#discussion_r367760573]), and I believe the conclusion was that sentinel values are not spec-compliant.

We never resolved the following point, but I was of the opinion that it'd be better to provide methods/functions that convert a dictionary array into a primitive array. My opinion was mainly informed by my concern that we don't have a way of using dictionary arrays in compute kernels, so at the time I preferred something to convert `dict(i32)[<k:[0, 0, null, 1, 2]><v:[1, 2, null]>]` to `i32<1, 1, null, 2, null>`.

The contributor of the PR provided a valid use case, which led them down the route of providing iterator access, so we eventually merged the PR on the premise that more work could be done in future to provide other access methods.

Regarding the 2 reasons:

R1: What do you mean by "rebuilding from that lookup"? Do you mean rebuilding a primitive array from the dictionary's iterator? If so, would a method that converts a dict(i32) into a primitive(i32) suffice for your needs?

R2: Could you please provide an example of what you mean by parallel comparison? My knowledge of SIMD and auto-vectorization is a bit limited, but what we noticed in the Rust implementation is that we can often forgo explicit SIMD in some compute kernels if we relegate null handling to bitmask manipulation and operate on the arrays without branching to check for nulls ([https://github.com/apache/arrow/pull/6086]).
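To make the dict(i32) to primitive(i32) conversion concrete, here is a minimal sketch in plain Rust. It deliberately does not use the Arrow crate's array types: the function name `dict_to_primitive` and the slice-of-`Option` representation are assumptions made for illustration only, and mapping an out-of-range key to null is likewise an assumed policy, not something the spec or the PR prescribes.

```rust
/// Hypothetical helper: decode dictionary-encoded i32 data into a flat,
/// nullable i32 column.
///
/// `keys` are indices into `values`; `None` stands in for a null slot.
/// Assumption for this sketch: a key that points past the end of `values`
/// is decoded as null rather than treated as an error.
pub fn dict_to_primitive(keys: &[Option<usize>], values: &[Option<i32>]) -> Vec<Option<i32>> {
    keys.iter()
        .map(|&key| {
            // A null key yields a null output. A valid key yields whatever
            // the dictionary holds at that index, which may itself be null.
            key.and_then(|k| values.get(k).copied().flatten())
        })
        .collect()
}
```

Called with the keys `[0, 0, null, 1, 2]` and values `[1, 2, null]` from the example above, this returns `[1, 1, null, 2, null]`: the third slot is null because its key is null, and the last slot is null because the dictionary entry it points to is null.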
> [Rust] Implement DictionaryArray
> --------------------------------
>
>                 Key: ARROW-5949
>                 URL: https://issues.apache.org/jira/browse/ARROW-5949
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Rust
>            Reporter: David Atienza
>            Assignee: David Atienza
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.17.0
>
>          Time Spent: 18h
>  Remaining Estimate: 0h
>
> I am pretty new to the codebase, but I have seen that DictionaryArray is not
> implemented in the Rust implementation.
> I went through the list of issues and I could not see any work on this. Is
> there any blocker?
>
> The specification is a bit
> [short|https://arrow.apache.org/docs/format/Layout.html#dictionary-encoding]
> or even
> [non-existent|https://arrow.apache.org/docs/format/Metadata.html#dictionary-encoding],
> so I am not sure how to implement it myself.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)