zhongyujiang commented on code in PR #6836:
URL: https://github.com/apache/iceberg/pull/6836#discussion_r1107953666
##########
parquet/src/test/java/org/apache/iceberg/parquet/TestDictionaryRowGroupFilter.java:
##########
@@ -1030,7 +1030,7 @@ public void testMissingDictionaryPageForColumn() {
IllegalStateException.class,
"Failed to read required dictionary page for id: 5",
() ->
- new ParquetDictionaryRowGroupFilter(SCHEMA, notEqual("some_nulls",
"some"))
+ new ParquetDictionaryRowGroupFilter(SCHEMA, equal("some_nulls",
"some"))
Review Comment:
Thanks for reviewing!
I checked the PR #93 that introduced this UT and the relevant
[discussion](https://github.com/apache/iceberg/pull/86#discussion_r253802993).
Basically it wants to throw an exception when no dict is available to avoid NPE
after calling `dict`. But as @rdblue mentioned
[here](https://github.com/apache/iceberg/pull/86#discussion_r253943635), all
runs of `dict` in `ParquetDictionaryRowGroupFilter` is guaranteed that there
must be a dictionary. So under normal circumstances, this exception will not be
thrown out I think. The reason why this UT can catch the exception is because
it mocked an abnormal
[DictionaryPageReadStore](https://github.com/apache/iceberg/blob/49d833aa83f6d4681c6c9f8473080f6e78124a0e/parquet/src/test/java/org/apache/iceberg/parquet/TestDictionaryRowGroupFilter.java#L1034).
Above is the context about this UT. So I think replacing it with `eq` can
still serve the original purpose of verification, right? So I think moving
null-check is reasonable because the run of `notEq` or `notIn` is guaranteed
there will be a dictionary there. Moving it forward can help avoiding the
dictionary decoding overhead when the column has nulls.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]