github-actions[bot] commented on code in PR #63001:
URL: https://github.com/apache/doris/pull/63001#discussion_r3260858575


##########
be/src/core/column/column_const.cpp:
##########
@@ -108,7 +109,10 @@ void ColumnConst::get_permutation(bool /*reverse*/, size_t 
/*limit*/, int /*nan_
 }
 
 void ColumnConst::replace_float_special_values() {
-    data->replace_float_special_values();
+    // COW: get exclusive ownership of data before mutating

Review Comment:
   This moves the live nested column out of `ColumnConst::data` before the 
mutation is guaranteed to finish. If `IColumn::mutate()` has to clone and 
throws, or if `replace_float_special_values()` throws while normalizing a 
nested nullable/complex column, `data` remains moved-from and the `ColumnConst` 
object is left invalid under the surrounding catch-and-convert paths. Please 
detach from a copied `ColumnPtr` (or use a restore guard) and assign back only 
after the throwing work succeeds.



##########
be/src/exec/common/variant_util.cpp:
##########
@@ -2147,10 +2146,15 @@ Status _parse_and_materialize_variant_columns(Block& 
block,
     for (size_t i = 0; i < variant_pos.size(); ++i) {
         auto column_ref = block.get_by_position(variant_pos[i]).column;
         bool is_nullable = column_ref->is_nullable();

Review Comment:
   This detaches a clone from the block slot, but the non-scalar `continue` 
path below drops that clone after `finalize()`. `column_ref` is a copy of 
`block.get_by_position(...).column`, so the block still owns the original 
column while `mutate()` can finalize a detached copy; non-scalar variants then 
skip the assignment at the end and lose the finalize side effects. Please write 
the finalized owner back to the block before continuing, or only 
detach/finalize on the scalar-conversion path that actually replaces the slot. 
This is distinct from the existing nullable-variant COW thread because it 
affects the already-parsed/non-scalar early-exit path.



##########
be/src/core/column/column_variant.cpp:
##########
@@ -2101,7 +2114,24 @@ void ColumnVariant::clear_sparse_column() {
     }
 #endif
 
-    serialized_sparse_column->clear();
+    serialized_sparse_column = ColumnPtr(create_binary_column_fn());
+}
+
+void ColumnVariant::ensure_binary_columns_rows() {
+    auto resize_if_empty = [this](WrappedPtr& column) {
+        const auto& const_column = static_cast<const IColumn::Ptr&>(column);
+        if (const_column->size() == num_rows) {

Review Comment:
   `ensure_binary_columns_rows()` moves the live binary column member out 
before `resize(num_rows)` succeeds. If detaching or resizing the 
sparse/doc-value binary column throws, the corresponding `ColumnVariant` member 
stays moved-from, and callers such as `finalize()` continue unwinding with a 
partially corrupted variant. Please resize a detached temporary from a copied 
pointer, or install a scoped restore guard before consuming the member.



##########
be/src/core/block/block.cpp:
##########
@@ -655,9 +815,26 @@ void Block::clear_column_data(int64_t column_size) 
noexcept {
     }
     for (auto& d : data) {
         if (d.column) {
-            // Temporarily disable reference count check because a column 
might be referenced multiple times within a block.

Review Comment:
   This fast path only checks top-level exclusivity, but composite columns can 
be uniquely owned at the parent while sharing child columns. For example, a 
cloned `ColumnArray` can have an exclusive array wrapper with shared 
data/offset children; `assume_mutable()->clear()` then clears the shared 
children and corrupts the alias. The new `scoped_mutate_column()` already uses 
recursive exclusivity for the same reason, so `clear_column_data()` should 
either use that recursive check or clone an empty column whenever the whole 
column tree is not exclusive. The vector overload below has the same pattern.



##########
be/src/core/column/column_const.h:
##########
@@ -321,7 +322,11 @@ class ColumnConst final : public COWHelper<IColumn, 
ColumnConst> {
 
     size_t deserialize_impl(const char* pos) override {
         ++s;
-        return data->deserialize_impl(pos);

Review Comment:
   `deserialize_impl()` now increments `s` and moves `data` out before nested 
deserialization succeeds. A malformed input or allocation failure inside 
`IColumn::mutate()`/`mutable_data->deserialize_impl(pos)` leaves the const 
column with `s` advanced and/or `data` null because the write-back is skipped. 
Please mutate a copied owner and update both `data` and `s` only after 
successful nested deserialization, or restore them on every exception path.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to