RussellSpitzer commented on code in PR #10953:
URL: https://github.com/apache/iceberg/pull/10953#discussion_r1803845635
##########
arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorHolder.java:
##########
@@ -140,12 +141,21 @@ public static class ConstantVectorHolder<T> extends
VectorHolder {
private final int numRows;
public ConstantVectorHolder(int numRows) {
+ super(new NullVector("_dummy_", numRows), null, new
NullabilityHolder(numRows));
+ nullabilityHolder().setNulls(0, numRows);
this.numRows = numRows;
this.constantValue = null;
}
public ConstantVectorHolder(Types.NestedField icebergField, int numRows, T
constantValue) {
- super(icebergField);
+ super(
Review Comment:
Ok having gone through the code I think I may have a strategy here but
please let me know what i'm missing in this interpretation.
ConstantVectorHolder is used as a dummy holder prior to this commit. It has
no *vector* , column descriptor, etc ... everything is null. The current
approach sometimes makes a ConstantVectorHolder which sometimes *does* have
something inside of it. This is done so that GenericArrowVectorAccessorFactory
will have something to work with.
From what I can see in GenericArrowVectorAccessorFactory is that it doesn't
work with ConstantVectorHolders at all. It currently can only handle cases in
which holder.vector() is a non null type and matches some known vector type.
Spark on the other hand, does not actually use this path and when it sees a
ConstantVectorHolder it instead creates a o.a.i.s.ConstantColumnVector. This is
why I don't think Spark has any issue with this. (cc @amogh-jahagirdar )
Now this makes me think what we really need is to just fix
GenericArrowVectorAccessorFactory to handle the case where the "vector" is null
and in that circumstance attempt to cast the VectorHolder to a
ConstantVectorHolder and create the appropriate vector type or accessor. I
think we have a few options to do this.
I think the easiest may be to just create an accessor that looks like the
Spark ConstantColumnVector and just use that as our generic accessor.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]