snuyanzin commented on code in PR #21947: URL: https://github.com/apache/flink/pull/21947#discussion_r1108331796
########## flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/BuiltInFunctionDefinitions.java: ########## @@ -178,6 +178,35 @@ ANY, and(logical(LogicalTypeRoot.BOOLEAN), LITERAL) .runtimeClass( "org.apache.flink.table.runtime.functions.scalar.ArrayDistinctFunction") .build(); + + public static final BuiltInFunctionDefinition ARRAY_SIZE = + BuiltInFunctionDefinition.newBuilder() + .name("ARRAY_SIZE") + .kind(SCALAR) + .inputTypeStrategy( + sequence( + Collections.singletonList("haystack"), + Collections.singletonList(logical(LogicalTypeRoot.ARRAY)))) + .outputTypeStrategy( + nullableIfArgs(ConstantArgumentCount.of(0), explicit(DataTypes.INT()))) + .runtimeClass( + "org.apache.flink.table.runtime.functions.scalar.ArraySizeFunction") + .build(); + + public static final BuiltInFunctionDefinition ARRAY_REMOVE = + BuiltInFunctionDefinition.newBuilder() + .name("ARRAY_REMOVE") + .kind(SCALAR) + .inputTypeStrategy( + sequence( + Arrays.asList("haystack", "needle"), + Arrays.asList( + logical(LogicalTypeRoot.ARRAY), ARRAY_ELEMENT_ARG))) Review Comment: `ARRAY_ELEMENT_ARG` not sure about this since array could consists of non null elements and a query could try to remove `null` e.g. ```sql SELECT array_remove(array['abc', 'klm', 'xyz'], cast(null as varchar)); ``` ########## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java: ########## @@ -160,6 +160,142 @@ Stream<TestSetSpec> getTestSetSpecs() { null }, DataTypes.ARRAY( - DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.DATE())))); + DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.DATE()))), + TestSetSpec.forFunction(BuiltInFunctionDefinitions.ARRAY_SIZE) + .onFieldsWithData( + new Integer[] {1, 2, 3}, + new Integer[] {null, 1, 2, 3, 4, 5, 4, 3, 2, 1, null}, + null, + new String[] {"Hello", "Hello", "Hello"}, + new Row[] { + Row.of(true, LocalDate.of(2022, 4, 20)), + Row.of(true, LocalDate.of(1990, 10, 14)), + Row.of(true, LocalDate.of(1990, 10, 14)), + Row.of(true, LocalDate.of(1990, 10, 14)), + null + }) + .andDataTypes( + DataTypes.ARRAY(DataTypes.INT()), + DataTypes.ARRAY(DataTypes.INT()), + DataTypes.ARRAY(DataTypes.INT()), + DataTypes.ARRAY(DataTypes.STRING()).notNull(), + DataTypes.ARRAY( + DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.DATE()))) + .testResult( + $("f0").arraySize(), + "ARRAY_SIZE(f0)", + 3, + DataTypes.INT().nullable()) + .testResult( + $("f1").arraySize(), + "ARRAY_SIZE(f1)", + 11, + DataTypes.INT().nullable()) + .testResult( + $("f2").arraySize(), + "ARRAY_SIZE(f2)", + null, + DataTypes.INT().nullable()) + // ARRAY<STRING> NOT NULL + .testResult( + $("f3").arraySize(), "ARRAY_SIZE(f3)", 3, DataTypes.INT().notNull()) + .testResult( + $("f4").arraySize(), + "ARRAY_SIZE(f4)", + 5, + DataTypes.INT().nullable()), + TestSetSpec.forFunction(BuiltInFunctionDefinitions.ARRAY_REMOVE) + .onFieldsWithData( + new Integer[] {1, 2, 2}, + null, + new String[] {"Hello", "World"}, + new Row[] { + Row.of(true, LocalDate.of(2022, 4, 20)), + Row.of(true, LocalDate.of(1990, 10, 14)), + null + }, + new Integer[] {null, null, 1}) + .andDataTypes( + DataTypes.ARRAY(DataTypes.INT()), + DataTypes.ARRAY(DataTypes.INT()), + DataTypes.ARRAY(DataTypes.STRING()).notNull(), + DataTypes.ARRAY( + DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.DATE())), + DataTypes.ARRAY(DataTypes.INT())) + // ARRAY<INT> + .testResult( + $("f0").arrayRemove(2), + "ARRAY_REMOVE(f0, 2)", + new Integer[] {1}, + DataTypes.ARRAY(DataTypes.INT()).nullable()) + .testResult( + $("f0").arrayRemove(42), + "ARRAY_REMOVE(f0, 42)", + new Integer[] {1, 2, 2}, + DataTypes.ARRAY(DataTypes.INT()).nullable()) + // ARRAY<INT> of null value + .testResult( + $("f1").arrayRemove(12), + "ARRAY_REMOVE(f1, 12)", + null, + DataTypes.ARRAY(DataTypes.INT()).nullable()) + .testResult( + $("f1").arrayRemove(null), + "ARRAY_REMOVE(f1, NULL)", + null, + DataTypes.ARRAY(DataTypes.INT()).nullable()) + // ARRAY<STRING> NOT NULL + .testResult( + $("f2").arrayRemove("Hello"), + "ARRAY_REMOVE(f2, 'Hello')", + new String[] {"World"}, + DataTypes.ARRAY(DataTypes.STRING()).notNull()) + // ARRAY<ROW<BOOLEAN, DATE>> + .testResult( + $("f3").arrayRemove(row(true, LocalDate.of(1990, 10, 14))), + "ARRAY_REMOVE(f3, (TRUE, DATE '1990-10-14'))", + new Row[] {Row.of(true, LocalDate.of(2022, 4, 20)), null}, + DataTypes.ARRAY( + DataTypes.ROW( + DataTypes.BOOLEAN(), DataTypes.DATE())) + .nullable()) + .testResult( + $("f3").arrayRemove(row(false, LocalDate.of(1990, 10, 14))), + "ARRAY_REMOVE(f3, (FALSE, DATE '1990-10-14'))", + new Row[] { + Row.of(true, LocalDate.of(2022, 4, 20)), + Row.of(true, LocalDate.of(1990, 10, 14)), + null + }, + DataTypes.ARRAY( + DataTypes.ROW( + DataTypes.BOOLEAN(), DataTypes.DATE())) + .nullable()) + .testResult( + $("f3").arrayRemove(null), + "ARRAY_REMOVE(f3, null)", + new Row[] { + Row.of(true, LocalDate.of(2022, 4, 20)), + Row.of(true, LocalDate.of(1990, 10, 14)), + }, + DataTypes.ARRAY( + DataTypes.ROW( + DataTypes.BOOLEAN(), DataTypes.DATE())) + .nullable()) + // ARRAY<INT> with null elements + .testResult( Review Comment: I would suggest to add more complex cases like removal from array of arrays or array of maps or array of rows ########## flink-python/docs/reference/pyflink.table/expressions.rst: ########## @@ -226,6 +226,8 @@ advanced type helper functions Expression.element Expression.array_contains Expression.array_distinct + Expression.array_size + Expression.array_remove Review Comment: I would recommend keeping abc order (at least for `array_*` expressions) here since it simplifies search/navigation ########## docs/data/sql_functions.yml: ########## @@ -617,6 +617,12 @@ collection: - sql: ARRAY_DISTINCT(haystack) table: haystack.arrayDistinct() description: Returns an array with unique elements. If the array itself is null, the function will return null. Keeps ordering of elements. + - sql: ARRAY_SIZE(haystack) + table: haystack.arraySize() + description: Returns the size of an array. If the array itself is null, the function will return null. + - sql: ARRAY_REMOVE(haystack) Review Comment: didn't get ... shouldn't `ARRAY_REMOVE` have 2 args? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org