jorisvandenbossche commented on pull request #8271:
URL: https://github.com/apache/arrow/pull/8271#issuecomment-701380049
> This also raised a question: Should both the list entry and the string
array entry have a missing/null value if the input string contains a null
value? I think it should because if we ask for the underlying string array, and
the string value that the missing list entry points to is not a missing value,
it will be an empty string, which seems odd to me.
But if the original string was null, then the output also contains a
(top-level) null, and not a list containing a null?
And such top-level nulls typically do not get an entry in the
actual values array (at least not when Arrow builds up a list array itself;
the format might not even allow otherwise). E.g.:
```
In [10]: arr = pa.array([["a", "b"], None, ["c"], [None]])

In [11]: arr
Out[11]:
<pyarrow.lib.ListArray object at 0x7f2316c1d108>
[
  [
    "a",
    "b"
  ],
  null,
  [
    "c"
  ],
  [
    null
  ]
]

In [12]: arr.values
Out[12]:
<pyarrow.lib.StringArray object at 0x7f2316c1d5e8>
[
  "a",
  "b",
  "c",
  null
]
```
(so only the null that was inside a list is represented as null in the
underlying values array)
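
To make the distinction concrete, here is a minimal plain-Python sketch of how a list array can be laid out as a validity mask, an offsets list, and a flat values list. The helper `encode_list_array` is hypothetical and only for illustration; it is not Arrow's builder code, and it assumes the common convention that a null list slot gets zero length.

```python
from typing import List, Optional, Tuple

Row = Optional[List[Optional[str]]]


def encode_list_array(rows: List[Row]) -> Tuple[List[bool], List[int], List[Optional[str]]]:
    """Illustrative model of a list layout: (validity, offsets, values)."""
    validity: List[bool] = []
    offsets: List[int] = [0]
    values: List[Optional[str]] = []
    for row in rows:
        if row is None:
            # Top-level null: mark the slot invalid and add nothing to values,
            # so the offset simply repeats (zero-length slot).
            validity.append(False)
        else:
            # Valid list: its elements, including inner nulls, go into values.
            validity.append(True)
            values.extend(row)
        offsets.append(len(values))
    return validity, offsets, values


validity, offsets, values = encode_list_array([["a", "b"], None, ["c"], [None]])
print(validity)  # [True, False, True, True]
print(offsets)   # [0, 2, 2, 3, 4]
print(values)    # ['a', 'b', 'c', None]
```

In this model the top-level null at index 1 only shows up in the validity mask and as a repeated offset, while the null inside the last list is the only one that occupies a slot in the values.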