wuchong commented on code in PR #2537:
URL: https://github.com/apache/fluss/pull/2537#discussion_r2750734980
##########
fluss-lake/fluss-lake-lance/src/test/java/org/apache/fluss/lake/lance/tiering/LanceTieringTest.java:
##########
@@ -322,4 +325,140 @@ private Schema createTable(LanceConfig config) {
return schema;
}
+
+ @Test
+ void testTieringWriteTableWithArrayType() throws Exception {
Review Comment:
This test looks unusual: it manually instantiates components such as
`LakeWriter` and `LakeCommitter` through `LanceLakeTieringFactory` to perform
tiering. This approach is error-prone and does not exercise the actual
end-to-end path that must be covered when adding support for a new type like
`ARRAY`. If we were to add 10 new types, would we really write 10 separate,
hand-wired integration tests like this? Such tests are not maintainable.
To properly validate `ARRAY` type support, we should perform an end-to-end
verification. Specifically, we can simply add a few `ARRAY`-type columns (like
`array<string>` and `array<int>`) to the `logTable` created in
`org.apache.fluss.lake.lance.tiering.LanceTieringITCase#testTiering`, and then
verify that these array fields are correctly written and read in the result
assertions.
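For illustration only, here is a minimal sketch of what the result assertion
could look like once array columns are part of the rows. It assumes rows are
materialized as plain `Object[]`, which is a simplification: the real ITCase
works with Fluss rows and reads back through Lance. The class and method names
(`ArrayColumnAssertions`, `sampleRows`, `firstMismatch`) are hypothetical and
do not exist in the code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArrayColumnAssertions {

    /** Builds sample rows shaped as (int id, String name, String[] tags, int[] scores). */
    static List<Object[]> sampleRows(int count) {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            rows.add(new Object[] {
                i,
                "v" + i,
                new String[] {"a" + i, "b" + i}, // stands in for array<string>
                new int[] {i, i + 1, i + 2}      // stands in for array<int>
            });
        }
        return rows;
    }

    /** Returns the index of the first mismatching row, or -1 when all rows match. */
    static int firstMismatch(List<Object[]> expected, List<Object[]> actual) {
        int common = Math.min(expected.size(), actual.size());
        for (int i = 0; i < common; i++) {
            // deepEquals descends into nested arrays, including primitive int[].
            if (!Arrays.deepEquals(expected.get(i), actual.get(i))) {
                return i;
            }
        }
        // A differing row count is reported as a mismatch at the first missing row.
        return expected.size() == actual.size() ? -1 : common;
    }
}
```

With a helper like this, the array columns ride along in the existing row
comparison, so adding another type only means adding another column to the
sample rows rather than another hand-wired test.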
Additionally, for the Lance format, `ARRAY` types are especially important
for **vector embeddings**. Therefore, we should also include a dedicated test
case for `ARRAY<FLOAT>` to ensure proper handling of vector data.
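As a rough sketch of the vector check, assuming the embedding column is read
back as a `float[]`: the names below (`EmbeddingColumnCheck`, `embedding`,
`sameVector`) are hypothetical helpers, not existing Fluss or Lance APIs. The
vectors are generated deterministically from the row id so that the expected
values can be regenerated at assertion time.

```java
import java.util.Arrays;
import java.util.Random;

public class EmbeddingColumnCheck {

    /** Deterministic pseudo-embedding: the same rowId and dim always yield the same vector. */
    static float[] embedding(int rowId, int dim) {
        Random random = new Random(rowId); // seeded by row id for reproducibility
        float[] vector = new float[dim];
        for (int i = 0; i < dim; i++) {
            vector[i] = random.nextFloat();
        }
        return vector;
    }

    /**
     * True when both vectors have the same dimension and bit-identical elements.
     * Arrays.equals compares floats via Float.floatToIntBits, so NaN equals NaN
     * and 0.0f differs from -0.0f, which suits a lossless round-trip check.
     */
    static boolean sameVector(float[] expected, float[] actual) {
        return Arrays.equals(expected, actual);
    }
}
```

An exact comparison is used here instead of a tolerance because writing and
reading a vector back should not alter its values.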