Wes McKinney created ARROW-6359:
-----------------------------------
Summary: [C++] Raw data equality in arrays vs. semantic value equality
Key: ARROW-6359
URL: https://issues.apache.org/jira/browse/ARROW-6359
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Wes McKinney
I have observed a conflict in requirements / expectations in our {{Equals}}
functions. The initial implementations of these functions would compare the raw
bytes found in non-null data slots, in addition to the validity bitmaps in each
array and their respective children, taking slice offsets and so forth into
account.
Recently we have been adding type-specific value comparison semantics to these
comparisons, notably propagating {{NaN != NaN}}. This has led to such issues as
ARROW-6043.
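
To make the distinction concrete, here is a minimal sketch of the two notions of equality for a single double value. The helper names {{RawBytesEqual}} and {{SemanticEqual}} are hypothetical and are not part of the Arrow API; they only illustrate why the two approaches disagree on NaN.

```cpp
#include <cmath>
#include <cstring>
#include <limits>

// Hypothetical helpers for illustration only; not Arrow functions.

// Raw comparison: two doubles are equal when their bit patterns match.
inline bool RawBytesEqual(double a, double b) {
  return std::memcmp(&a, &b, sizeof(double)) == 0;
}

// Semantic comparison: IEEE 754 rules apply, so NaN never equals NaN.
inline bool SemanticEqual(double a, double b) { return a == b; }
```

A NaN compared with a copy of itself is equal under the raw comparison but unequal under the semantic one. The reverse happens for 0.0 and -0.0, which differ in their bytes yet compare equal under IEEE 754.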
Rather than creating "one true way" to compare array contents, I would suggest
introducing functions that perform slightly different comparisons:
* Raw data comparison, skipping masked null values
* Raw data comparison, comparing all buffer contents (up to the semantic
"extent" of an array -- so ignoring the contents of padding, or excess buffer
contents when dealing with slices)
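
The two raw comparison modes listed above could look roughly like the following sketch. {{ToySlice}}, {{RawEqualsSkipNulls}} and {{RawEqualsAllSlots}} are hypothetical names over a simplified int32 array, not Arrow classes or functions. The sketch also assumes that both modes require the validity flags to match, which the proposal does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a sliced primitive array; not an Arrow class.
struct ToySlice {
  std::vector<int32_t> values;  // backing buffer, may extend past the slice
  std::vector<bool> valid;      // one validity flag per slot in `values`
  std::size_t offset;           // first slot that belongs to the slice
  std::size_t length;           // number of slots in the slice
};

inline void CheckBounds(const ToySlice& s) {
  assert(s.valid.size() == s.values.size());
  assert(s.offset + s.length <= s.values.size());
}

// Mode 1: validity must match; values are compared only in non-null slots.
inline bool RawEqualsSkipNulls(const ToySlice& a, const ToySlice& b) {
  CheckBounds(a);
  CheckBounds(b);
  if (a.length != b.length) return false;
  for (std::size_t i = 0; i < a.length; ++i) {
    const bool a_valid = a.valid[a.offset + i];
    const bool b_valid = b.valid[b.offset + i];
    if (a_valid != b_valid) return false;
    if (a_valid && a.values[a.offset + i] != b.values[b.offset + i]) {
      return false;
    }
  }
  return true;
}

// Mode 2: validity must match (an assumption of this sketch) and every slot
// inside the slice extent must match byte for byte, null slots included.
// Slots outside [offset, offset + length) are ignored.
inline bool RawEqualsAllSlots(const ToySlice& a, const ToySlice& b) {
  CheckBounds(a);
  CheckBounds(b);
  if (a.length != b.length) return false;
  for (std::size_t i = 0; i < a.length; ++i) {
    if (a.valid[a.offset + i] != b.valid[b.offset + i]) return false;
  }
  if (a.length == 0) return true;
  return std::memcmp(a.values.data() + a.offset, b.values.data() + b.offset,
                     a.length * sizeof(int32_t)) == 0;
}
```

Two slices that differ only in the bytes stored under a null slot are equal under the first mode and unequal under the second, while differences outside the slice extent affect neither.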
Thoughts?