eerhardt commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r415075754
##########
File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs
##########
@@ -84,7 +84,7 @@ public ArrayData Slice(int offset, int length)
length = Math.Min(Length - offset, length);
offset += Offset;
- return new ArrayData(DataType, length, -1, offset, Buffers,
Children);
+ return new ArrayData(DataType, length, NullCount, offset, Buffers,
Children);
Review comment:
This isn't correct because we are slicing a subset of the `ArrayData`.
If the total `ArrayData` has values:
`0`, `null`, `2`, `3`, `4`
`NullCount` for the outer `ArrayData` will be `1`.
But if we are slicing from `offset = 2` and `length = 3` and creating a new
`ArrayData` from the above, the new `ArrayData` won't have any nulls in it. But
with this change, it will have `NullCount = 1` from its parent.
`NullCount = -1` in the original code means "we don't know yet, it needs to
be computed". That was done so we don't have to count the nulls in the sliced
values until it is necessary. Since there are cases where it may not be
required. #Closed
##########
File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs
##########
@@ -22,6 +22,8 @@ namespace Apache.Arrow
{
public sealed class ArrayData : IDisposable
{
+ public static readonly int RecalculateNullCount = -1;
Review comment:
Can we leave this `private` for now? We can always make it public later.
#Closed
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]