github-actions[bot] commented on code in PR #65127:
URL: https://github.com/apache/doris/pull/65127#discussion_r3507207998
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/load/NereidsDataDescription.java:
##########
@@ -990,9 +990,10 @@ private void checkLoadPriv(String fullDbName) throws AnalysisException {
}
}
- // Change all the columns name to lower case, because Doris column is case-insensitive.
+ // Change text-format column names to lower case, because Doris column is case-insensitive.
private void columnsNameToLowerCase(List<String> columns) {
- if (columns == null || columns.isEmpty() || "json".equals(analysisMap.get(FileFormatProperties.PROP_FORMAT))) {
+ String format = analysisMap.get(FileFormatProperties.PROP_FORMAT);
+ if (columns == null || columns.isEmpty() || "json".equals(format) || "arrow".equals(format)) {
return;
Review Comment:
This guard still misses non-lowercase spellings of the Arrow format. For
broker/load SQL, `LogicalPlanBuilder` passes the raw `FORMAT AS` token into
this constructor, and this method runs before
`FileFormatProperties.createFileFormatProperties()` normalizes the value
case-insensitively. As a result, `FORMAT AS ARROW` (or quoted `Arrow`) is
later accepted as `FORMAT_ARROW`, but the guard does not fire and
`COLUMNS(EV)` is still rewritten to `ev`. The BE Arrow reader then compares
the block slot name with the Arrow schema field name exactly, so a file field
named `EV` still fails. Please make this preserve-name check case-insensitive,
or normalize the format before lowercasing column names, and add coverage for
a non-lowercase Arrow format spelling.
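
One possible shape for the case-insensitive guard is sketched below. It is
a standalone illustration, not the actual patch: the class name
`FormatAwareLowercase`, the helper `preservesColumnCase`, and passing the
format as a parameter instead of reading it from `analysisMap` are all
assumptions made so the snippet runs on its own. It also returns a new list,
which may differ from how the real method updates its argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FormatAwareLowercase {
    // Hypothetical helper: true for formats whose column names must be kept
    // exactly as written. equalsIgnoreCase is null-safe on its argument, so a
    // missing format falls through to the lowercasing path.
    static boolean preservesColumnCase(String format) {
        return "json".equalsIgnoreCase(format) || "arrow".equalsIgnoreCase(format);
    }

    // Simplified stand-in for columnsNameToLowerCase: the format is passed in
    // directly (assumption) and a lowered copy is returned.
    static List<String> columnsNameToLowerCase(List<String> columns, String format) {
        if (columns == null || columns.isEmpty() || preservesColumnCase(format)) {
            return columns;
        }
        List<String> lowered = new ArrayList<>(columns.size());
        for (String column : columns) {
            lowered.add(column.toLowerCase(Locale.ROOT));
        }
        return lowered;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("id", "EV");
        System.out.println(columnsNameToLowerCase(columns, "ARROW")); // [id, EV]
        System.out.println(columnsNameToLowerCase(columns, "csv"));   // [id, ev]
    }
}
```

The sketch does not strip quotes from the format token; if the raw
`FORMAT AS` value can still carry quotes at this point, normalizing it first
would be the safer of the two options above.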
##########
regression-test/suites/load_p0/stream_load/test_arrow_stream_load_uppercase_column.groovy:
##########
@@ -0,0 +1,91 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite("test_arrow_stream_load_uppercase_column", "p0") {
+ sql """DROP TABLE IF EXISTS test_arrow_stream_load_uppercase_column_explicit"""
+ sql """
+ CREATE TABLE test_arrow_stream_load_uppercase_column_explicit (
+ `id` INT,
+ `EV` DOUBLE
+ ) ENGINE=OLAP
+ DUPLICATE KEY(`id`)
+ DISTRIBUTED BY HASH(`id`) BUCKETS 1
+ PROPERTIES ("replication_allocation" = "tag.location.default: 1");
+ """
+
+ streamLoad {
+ table "test_arrow_stream_load_uppercase_column_explicit"
+ set 'format', 'arrow'
+ set 'columns', 'id,EV'
+ file 'arrow_uppercase_column.arrow'
+ time 10000
+ check { result, exception, startTime, endTime ->
+ if (exception != null) {
+ throw exception
+ }
+ log.info("Stream load result: ${result}".toString())
+ def json = parseJson(result)
+ assertEquals("success", json.Status.toLowerCase())
+ assertEquals(2, json.NumberTotalRows)
+ assertEquals(0, json.NumberFilteredRows)
+ }
+ }
+
+ sql "sync"
+ def explicitRows = sql """
+ SELECT concat(cast(id AS STRING), ':', cast(EV AS STRING))
+ FROM test_arrow_stream_load_uppercase_column_explicit
+ ORDER BY id
+ """
+ assertEquals([["1:2.5"], ["2:3.5"]], explicitRows)
+
Review Comment:
This deterministic result check should go through the regression result
framework instead of a Groovy assertion. The project test rules ask for
deterministic expected results to be generated with `qt_`/`order_qt_` output
rather than `assert`, and this new suite currently has no `.out` file for the
loaded rows. Please keep the stream-load status checks as needed, but turn the
ordered row query into an `order_qt_...`/`qt_...` check and commit the
generated `.out` file for this suite.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]