joellubi commented on code in PR #43234:
URL: https://github.com/apache/arrow/pull/43234#discussion_r1680886545
##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -283,6 +283,28 @@ UUID
A specific UUID version is not required or guaranteed. This extension represents
UUIDs as FixedSizeBinary(16) with big-endian notation and does not interpret the bytes in any way.
+8-bit Boolean
+====
+
+Bool8 represents a boolean value using 1 byte (8 bits) to store each value instead of only 1 bit as in
+the native Arrow Boolean type. Although less compact that the native representation, Bool8 may have
+better zero-copy compatibility with various systems that also store booleans using 1 byte.
+
+* Extension name: ``arrow.bool8``.
+
+* The storage type of this extension is ``Int8`` where:
+
+ * **false** is denoted by the value ``0``.
+ * **true** can be specified using any non-zero value.
Review Comment:
Thanks @felipecrv. Would these optimizations still work if `1` is _preferred_
for `true` but any non-zero value is still considered valid? There may be some
"fast path" optimizations that check the least-significant bit first to see
whether the value is `1`, but a generic implementation can't take the absence
of that bit to mean `false`; it would need to check all the other bits as
well. Let me know if I'm understanding your suggestion correctly.
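To make the concern concrete, here is a minimal sketch of the two checks. It is
an illustration only, not code from the PR or from any Arrow implementation;
both function names are hypothetical, and plain Python integers stand in for
`Int8` storage values.

```python
def is_true_low_bit_only(value: int) -> bool:
    # Hypothetical fast path: looks only at the least-significant bit.
    return (value & 1) == 1


def is_true_generic(value: int) -> bool:
    # Generic rule from the proposed spec: any non-zero value is true.
    return value != 0


# A few values that fit in Int8 storage.
samples = [0, 1, 2, -1, 127, -128]

low_bit = [is_true_low_bit_only(v) for v in samples]
generic = [is_true_generic(v) for v in samples]

# Values on which the low-bit check alone gives the wrong answer.
mismatches = [v for v, a, b in zip(samples, low_bit, generic) if a != b]
```

The two checks agree on `0` and `1` but disagree on even non-zero values such
as `2` and `-128`, which is why the low bit alone can't decide `false`.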
> Producers MUST produce 0 or 1 values. Consumers SHOULD treat any non-zero value as true and 0 as false.
Can you clarify or give some examples of producers and consumers in this
case? Unless I'm misunderstanding, if producers MUST produce 0 or 1 values then
it's not clear how a consumer would ever receive any other value.
So I would then expect the inverse:
Producers **SHOULD** produce 0 or 1 values. Consumers **MUST** treat any
non-zero value as true and 0 as false.
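A rough sketch of how I read that wording, under stated assumptions:
`produce_bool8` and `consume_bool8` are hypothetical names, and a standard
library `array("b")` stands in for an `Int8` storage buffer. It is not meant to
mirror any existing Arrow API.

```python
from array import array


def produce_bool8(values):
    # Producer side (SHOULD): emit only the canonical values 0 and 1.
    return array("b", (1 if v else 0 for v in values))


def consume_bool8(storage):
    # Consumer side (MUST): treat any non-zero storage value as true.
    return [v != 0 for v in storage]


# Storage written by a conforming producer.
canonical = produce_bool8([True, False, True])

# Storage handed over zero-copy from some other system (made-up values).
foreign = array("b", [0, 1, -1, 42])

decoded_canonical = consume_bool8(canonical)
decoded_foreign = consume_bool8(foreign)
```

With this split, a consumer still decodes a buffer that arrives from a system
using values other than `1` for `true`, while well-behaved producers keep
writing only `0` and `1`.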