wuchong commented on code in PR #2537:
URL: https://github.com/apache/fluss/pull/2537#discussion_r2750734980


##########
fluss-lake/fluss-lake-lance/src/test/java/org/apache/fluss/lake/lance/tiering/LanceTieringTest.java:
##########
@@ -322,4 +325,140 @@ private Schema createTable(LanceConfig config) {
 
         return schema;
     }
+
+    @Test
+    void testTieringWriteTableWithArrayType() throws Exception {

Review Comment:
   This test looks unusual: it manually instantiates components like 
`LakeWriter` and `LakeCommitter` through `LanceLakeTieringFactory` to perform 
tiering. That approach is error-prone and does not exercise the actual 
end-to-end path that needs coverage when adding support for a new type like 
`ARRAY`. If we were to add 10 new types, would we really write 10 separate, 
hand-wired integration tests like this? Such tests are not maintainable.
   
   To validate `ARRAY` type support properly, we should verify it end to end. 
Specifically, we can add a few `ARRAY`-type columns (like `array<string>` and 
`array<int>`) to the `logTable` created in 
`org.apache.fluss.lake.lance.tiering.LanceTieringITCase#testTiering`, and then 
check in the result assertions that these array fields are written and read 
back correctly.
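   
   To illustrate the maintainability point, here is a minimal sketch in plain 
Java. It does not use Fluss or Lance APIs: `ColumnSpec`, `logTableColumns`, and 
`writeThenRead` are hypothetical stand-ins, and the column names are made up 
rather than taken from the real `logTable`. The idea is only that one shared 
round-trip check covers every column, so a new type becomes one more list entry 
instead of a new hand-wired test.

```java
import java.util.Arrays;
import java.util.List;

public class ArrayColumnRoundTripSketch {

    /** Hypothetical column description: name, type label, and one sample value. */
    static final class ColumnSpec {
        final String name;
        final String type;
        final Object sample;

        ColumnSpec(String name, String type, Object sample) {
            this.name = name;
            this.type = type;
            this.sample = sample;
        }
    }

    /** Illustrative columns only; the real logTable schema is not reproduced here. */
    static List<ColumnSpec> logTableColumns() {
        return Arrays.asList(
                new ColumnSpec("id", "int", 1),
                new ColumnSpec("name", "string", "v1"),
                new ColumnSpec("tags", "array<string>", new String[] {"x", "y"}),
                new ColumnSpec("counts", "array<int>", new int[] {1, 2, 3}));
    }

    /** Stand-in for "tier to the lake, then read back": copies arrays, passes scalars through. */
    static Object writeThenRead(Object value) {
        if (value instanceof int[]) {
            return ((int[]) value).clone();
        }
        if (value instanceof String[]) {
            return ((String[]) value).clone();
        }
        return value;
    }

    /** Compares scalars with equals() and arrays element by element. */
    static boolean sameValue(Object expected, Object actual) {
        return Arrays.deepEquals(new Object[] {expected}, new Object[] {actual});
    }

    /** One shared check covers every column, so a new type is one more list entry. */
    static boolean allColumnsRoundTrip() {
        for (ColumnSpec column : logTableColumns()) {
            if (!sameValue(column.sample, writeThenRead(column.sample))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all columns round-trip: " + allColumnsRoundTrip());
    }
}
```

   In the real ITCase the stand-in `writeThenRead` would be replaced by the 
existing tiering job and the read of the Lance dataset; only the extra columns 
and their assertions would be new.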
   
   Additionally, for the Lance format, `ARRAY` types are especially important 
for **vector embeddings**. Therefore, we should also include a dedicated test 
case for `ARRAY<FLOAT>` to ensure proper handling of vector data.
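   
   For the `ARRAY<FLOAT>` case, the assertion probably wants to be stricter 
than a tolerance-based compare, since an embedding should come back unchanged. 
A minimal sketch of such a check, again in plain Java with a hypothetical 
helper name (`firstMismatch`) and no Fluss or Lance calls:

```java
public class FloatVectorAssertSketch {

    /**
     * Returns null when both vectors have the same dimension and identical
     * elements, otherwise a short description of the first difference.
     * Elements are compared via Float.floatToIntBits, so NaN equals NaN
     * and 0.0f is distinguished from -0.0f.
     */
    static String firstMismatch(float[] expected, float[] actual) {
        if (expected.length != actual.length) {
            return "dimension: expected " + expected.length + " but was " + actual.length;
        }
        for (int i = 0; i < expected.length; i++) {
            if (Float.floatToIntBits(expected[i]) != Float.floatToIntBits(actual[i])) {
                return "index " + i + ": expected " + expected[i] + " but was " + actual[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        float[] written = {0.12f, -0.5f, 3.25f};
        float[] readBack = written.clone(); // stand-in for the value read from the lake
        String mismatch = firstMismatch(written, readBack);
        System.out.println(mismatch == null ? "vectors match" : mismatch);
    }
}
```

   Checking the dimension first matters for embeddings, because a truncated 
or padded vector is a different failure from a changed element.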




To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to