github-actions[bot] commented on code in PR #63182:
URL: https://github.com/apache/doris/pull/63182#discussion_r3237548977
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/load/NereidsStreamLoadPlanner.java:
##########
@@ -336,4 +340,32 @@ public TPipelineFragmentParams plan(TUniqueId loadId, int fragmentInstanceIdInde
         params.setIsMowTable(destTable.getEnableUniqueKeyMergeOnWrite());
         return params;
     }
+
+    static void validateStreamLoadForFlexiblePartialUpdate(NereidsLoadTaskInfo taskInfo) throws UserException {
+        if (taskInfo.getFormatType() != TFileFormatType.FORMAT_JSON) {
+            throw new UserException("flexible partial update only support json format as input file currently");
+        }
+        if (taskInfo.isFuzzyParse()) {
+            throw new UserException("Don't support flexible partial update when 'fuzzy_parse' is enabled");
+        }
+        if (!taskInfo.getColumnExprDescs().descs.isEmpty()) {
+            throw new UserException("Don't support flexible partial update when 'columns' is specified");
+        }
+        if (taskInfo.getJsonPaths() != null && !taskInfo.getJsonPaths().isEmpty()) {
+            throw new UserException("Don't support flexible partial update when 'jsonpaths' is specified");
+        }
+        if (taskInfo.getHiddenColumns() != null && !taskInfo.getHiddenColumns().isEmpty()) {
+            throw new UserException("Don't support flexible partial update when 'hidden_columns' is specified");
+        }
+        if (taskInfo.hasSequenceCol()) {
+            throw new UserException("Don't support flexible partial update when "
+                    + "'function_column.sequence_col' is specified");
+        }
Review Comment:
This checks only the normalized merge type value, so an explicitly supplied
`merge_type=APPEND` is accepted even though both the error message and the new
regression case require rejecting any specified `merge_type`.
`NereidsStreamLoadTask` drops the `request.isSetMergeType()` bit after parsing
and stores only the enum, whose default is also `APPEND`. As a result, the test
added at `test_flexible_partial_update_restricts.groovy:146` will plan
successfully instead of returning `Don't support flexible partial update when
'merge_type' is specified`. Please carry a `mergeTypeSpecified` flag, or
validate directly while the request is parsed, and reject it for flexible
partial update regardless of the enum value.
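
A minimal sketch of the "carry a flag" option, assuming a simplified task model.
`MergeTypeSpecifiedSketch`, its `TaskInfo` class, and the constructor taking a
raw header string are hypothetical stand-ins for illustration and are not the
actual `NereidsStreamLoadTask` API; only the error text is taken from the
comment above.

```java
import java.util.Locale;

// Hypothetical stand-ins for illustration; not the real Doris classes.
final class MergeTypeSpecifiedSketch {
    enum MergeType { APPEND, MERGE, DELETE }

    /** Simplified task info: a null header means merge_type was not supplied. */
    static final class TaskInfo {
        final MergeType mergeType;
        final boolean mergeTypeSpecified;

        TaskInfo(String mergeTypeHeader) {
            // Remember whether the caller sent the property before normalizing,
            // because the default enum value is indistinguishable from APPEND.
            this.mergeTypeSpecified = mergeTypeHeader != null;
            this.mergeType = mergeTypeSpecified
                    ? MergeType.valueOf(mergeTypeHeader.toUpperCase(Locale.ROOT))
                    : MergeType.APPEND;
        }
    }

    /** Rejects any explicitly supplied merge_type, whatever its value. */
    static void validateFlexiblePartialUpdate(TaskInfo info) {
        if (info.mergeTypeSpecified) {
            throw new IllegalArgumentException(
                    "Don't support flexible partial update when 'merge_type' is specified");
        }
    }

    public static void main(String[] args) {
        validateFlexiblePartialUpdate(new TaskInfo(null)); // accepted: nothing specified
        try {
            validateFlexiblePartialUpdate(new TaskInfo("APPEND"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the check reads the flag and never the enum, so
an explicit `APPEND` and a defaulted `APPEND` take different paths.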
##########
fe/fe-core/src/main/java/org/apache/doris/load/routineload/RoutineLoadJob.java:
##########
@@ -2017,6 +2022,9 @@ public void gsonPostProcess() throws IOException {
this.state = JobState.CANCELLED;
Review Comment:
This only restores `columnDescs` after reparsing `origStmt`, but
`setRoutineLoadDesc()` above can also overwrite every other descriptor field
that was already loaded from the image: separators, filters, partition info,
delete condition, merge type, and sequence column. A concrete restart path is:
create a routine load with `COLUMNS TERMINATED BY ','`, alter it to `COLUMNS
TERMINATED BY '|'`, checkpoint, then restart from image. Gson loads the altered
`columnSeparator`, but `gsonPostProcess()` reparses the original CREATE
statement and restores only `columnDescs`, so the separator reverts to `,`.
Please preserve/restore the complete persisted `RoutineLoadDesc` state in the
image post-process path, or avoid reapplying descriptor fields from `origStmt`
when the image already contains newer altered state. This is distinct from the
edit-log replay thread because this path runs after loading a checkpoint image,
not while replaying the alter journal.
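
A minimal sketch of the "prefer the persisted state" option, assuming a
two-field descriptor. `RoutineLoadRestoreSketch`, `Desc`, and
`mergePreferPersisted` are hypothetical names for illustration; the real
`RoutineLoadDesc` has more fields and a different shape.

```java
// Hypothetical stand-ins for illustration; not the real Doris classes.
final class RoutineLoadRestoreSketch {
    /** Simplified descriptor; null means the field was never set. */
    static final class Desc {
        final String columnSeparator;
        final String columnDescs;

        Desc(String columnSeparator, String columnDescs) {
            this.columnSeparator = columnSeparator;
            this.columnDescs = columnDescs;
        }
    }

    /**
     * Keeps every field already loaded from the image and falls back to the
     * value reparsed from the original statement only when the image has none.
     */
    static Desc mergePreferPersisted(Desc persisted, Desc reparsed) {
        return new Desc(
                persisted.columnSeparator != null ? persisted.columnSeparator : reparsed.columnSeparator,
                persisted.columnDescs != null ? persisted.columnDescs : reparsed.columnDescs);
    }

    public static void main(String[] args) {
        Desc fromImage = new Desc("|", "k1,v1");    // state after ALTER, as checkpointed
        Desc fromOrigStmt = new Desc(",", "k1,v1"); // state reparsed from the CREATE statement
        System.out.println(mergePreferPersisted(fromImage, fromOrigStmt).columnSeparator);
    }
}
```

Applied to the restart path described above, the altered `|` separator survives
because the reparsed `,` is used only as a fallback.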
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.