Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2931354359 TPC-H: Total hot run time: 34498 ms ``` machine: 'aliyun_ecs.c7a.8xlarge_32C64G' scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools Tpch sf100 test result on commit a439492bd3bfb6fafdab904ac61d3030eb79ddf9, data reload: false -- Round 1 -- q1 26263 514950755075 q2 1968289 185 185 q3 10380 1250755 755 q4 10242 995 548 548 q5 7632254423912391 q6 184 162 136 136 q7 924 767 627 627 q8 9302133811481148 q9 6732507851525078 q10 6855231119421942 q11 492 317 283 283 q12 352 367 225 225 q13 17804 370631453145 q14 237 236 207 207 q15 557 485 494 485 q16 453 443 378 378 q17 627 877 385 385 q18 7690723872347234 q19 1829968 594 594 q20 342 342 230 230 q21 3912323124852485 q22 1056989 962 962 Total cold run time: 115833 ms Total hot run time: 34498 ms - Round 2, with runtime_filter_mode=off - version_comment Doris version doris-0.0.0--a439492bd3 Doris version doris-0.0.0--a439492bd3 0 wait_timeout 28800 28800 0 workload_group 0 show table status; Name Engine Version Row_format RowsAvg_row_length Data_length Max_data_length Index_lengthData_free Auto_increment Create_time Update_time Check_time Collation ChecksumCreate_options Comment revenue0 ViewNULLNULL-1 0 0 NULL0 NULLNULL2023-12-26 18:27:24 NULLNULLutf-8 NULLNULL partsupp Doris NULLNULL800056 4534120086 NULL44625495NULLNULL2023-12-26 18:27:23 2023-12-26 18:44:20 NULLutf-8 NULLNULL part Doris NULLNULL200037 748811035 NULL 1935627 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:57 NULL utf-8 NULLNULL nation Doris NULLNULL25 138 3473NULL366 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:24 NULLutf-8 NULLNULL customer Doris NULLNULL150092 1381653732 NULL4374759 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:43 NULLutf-8 NULLNULL lineitem Doris NULLNULL600037902 33 19843441616 NULL61784740NULLNULL2023-12-26 18:27:23 2023-12-26 18:38:59 NULLutf-8 NULLNULL supplier Doris NULLNULL100 87 87519212NULL 194931 
NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:25 NULL utf-8 NULLNULL region Doris NULLNULL5 240 1201NULL147 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:24 NULLutf-8 NULLNULL orders Doris NULLNULL15000 42 6422171781 NULL22778155NULLNULL2023-12-26 18:27:23 2023-12-26 18:42:55 NULLutf-8 NULLNULL q1 5234513352055133 q2 243 316 222 222 q3 2159269223432343 q4 1349183014821482 q5 4468440944094409 q6 234 170 131 131 q7 2136q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21 q22 Total cold run time: 13687 ms Total hot run time: 13720 ms ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001:
URL: https://github.com/apache/doris/pull/51001#issuecomment-2930828804

# Cloud UT Coverage Report

Increment line coverage ` ` :tada:

[Increment coverage report](http://coverage.selectdb-in.cc/coverage/a439492bd3bfb6fafdab904ac61d3030eb79ddf9_a439492bd3bfb6fafdab904ac61d3030eb79ddf9_cloud/increment_report/index.html)
[Complete coverage report](http://coverage.selectdb-in.cc/coverage/a439492bd3bfb6fafdab904ac61d3030eb79ddf9_a439492bd3bfb6fafdab904ac61d3030eb79ddf9_cloud/report/index.html)

| Category | Coverage |
|---|---|
| Function Coverage | 83.31% (1118/1342) |
| Line Coverage | 66.75% (19132/28663) |
| Region Coverage | 66.38% (9467/14261) |
| Branch Coverage | 56.31% (5128/9106) |
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2930767577 run buildall
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2930747866 run buildall
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
github-actions[bot] commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2912462529 PR approved by anyone and no changes requested.
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
zhannngchen commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2912460889 run buildall
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2912976139 TPC-H: Total hot run time: 33837 ms ``` machine: 'aliyun_ecs.c7a.8xlarge_32C64G' scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools Tpch sf100 test result on commit c072b09b56f0fd649cdfaac0f1faa190b2fef4f3, data reload: false -- Round 1 -- q1 26243 505349874987 q2 2081279 181 181 q3 10542 1273699 699 q4 10226 1004521 521 q5 7749241923382338 q6 181 160 130 130 q7 923 746 616 616 q8 9319128010771077 q9 6844511850875087 q10 6869233418931893 q11 496 286 271 271 q12 347 351 220 220 q13 17800 364730753075 q14 242 221 216 216 q15 545 479 496 479 q16 430 438 375 375 q17 615 855 382 382 q18 7788725971437143 q19 1707984 581 581 q20 332 340 234 234 q21 3724323323962396 q22 996 1001936 936 Total cold run time: 115999 ms Total hot run time: 33837 ms - Round 2, with runtime_filter_mode=off - show table status; Name Engine Version Row_format RowsAvg_row_length Data_length Max_data_length Index_lengthData_free Auto_increment Create_time Update_time Check_time Collation ChecksumCreate_options Comment revenue0 ViewNULLNULL-1 0 0 NULL0 NULLNULL2023-12-26 18:27:24 NULLNULLutf-8 NULLNULL partsupp Doris NULLNULL800056 4534120086 NULL44625495NULLNULL2023-12-26 18:27:23 2023-12-26 18:44:20 NULLutf-8 NULLNULL part Doris NULLNULL200037 748811035 NULL 1935627 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:57 NULL utf-8 NULLNULL nation Doris NULLNULL25 138 3473NULL366 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:24 NULLutf-8 NULLNULL customer Doris NULLNULL150092 1381653732 NULL4374759 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:43 NULLutf-8 NULLNULL lineitem Doris NULLNULL600037902 33 19843441616 NULL61784740NULLNULL2023-12-26 18:27:23 2023-12-26 18:38:59 NULLutf-8 NULLNULL supplier Doris NULLNULL100 87 87519212NULL 194931 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:25 NULL utf-8 NULLNULL region Doris NULLNULL5 240 1201NULL147 NULLNULL2023-12-26 18:27:23 2023-12-26 
18:27:24 NULLutf-8 NULLNULL orders Doris NULLNULL15000 42 6422171781 NULL22778155NULLNULL2023-12-26 18:27:23 2023-12-26 18:42:55 NULLutf-8 NULLNULL q1 5171509550955095 q2 243 331 229 229 q3 2134268223042304 q4 1349176513351335 q5 4596442743874387 q6 214 166 127 127 q7 2025197717401740 q8 2589252625172517 q9 7202718670677067 q10 30013230q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21 q22 Total cold run time: 25523 ms Total hot run time: 24801 ms ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001:
URL: https://github.com/apache/doris/pull/51001#issuecomment-2912533743

# Cloud UT Coverage Report

Increment line coverage ` ` :tada:

[Increment coverage report](http://coverage.selectdb-in.cc/coverage/c072b09b56f0fd649cdfaac0f1faa190b2fef4f3_c072b09b56f0fd649cdfaac0f1faa190b2fef4f3_cloud/increment_report/index.html)
[Complete coverage report](http://coverage.selectdb-in.cc/coverage/c072b09b56f0fd649cdfaac0f1faa190b2fef4f3_c072b09b56f0fd649cdfaac0f1faa190b2fef4f3_cloud/report/index.html)

| Category | Coverage |
|---|---|
| Function Coverage | 83.25% (1113/1337) |
| Line Coverage | 66.13% (18664/28224) |
| Region Coverage | 65.80% (9264/14079) |
| Branch Coverage | 55.53% (4979/8966) |
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
github-actions[bot] commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2912462415 PR approved by at least one committer and no changes requested.
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
zhannngchen commented on code in PR #51001:
URL: https://github.com/apache/doris/pull/51001#discussion_r2103718015

## be/test/olap/delta_writer_test.cpp: ##

@@ -1047,4 +1050,195 @@ TEST_F(TestDeltaWriter, vec_sequence_col_concurrent_write) {
     res = engine_ref->tablet_manager()->drop_tablet(request.tablet_id, request.replica_id, false);
     ASSERT_TRUE(res.ok());
 }
+
+TEST_F(TestDeltaWriter, write_sigle_block_statistics_segment_meta_pb) {

Review Comment:
   should also test on complex types, such as array, map, jsonb
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
zhannngchen commented on code in PR #51001:
URL: https://github.com/apache/doris/pull/51001#discussion_r2103716900

## be/src/olap/rowset/segment_v2/segment_writer.cpp: ##

@@ -1030,9 +1034,14 @@ Status SegmentWriter::finalize_columns_data() {
     }
     _num_rows_written = 0;

+    int64_t total_data_size = 0;
     for (auto& column_writer : _column_writers) {
+        // record the data size of each column before page builder reset in finish()
+        total_data_size += column_writer->estimate_buffer_size();
         RETURN_IF_ERROR(column_writer->finish());
     }
+    auto origin_data_footprint = _footer.data_footprint();
+    _footer.set_data_footprint(origin_data_footprint + total_data_size);

Review Comment:
   it's not accurate, don't update data footprint field
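The code comment in the hunk above notes that each column's size has to be read before `finish()` resets the page builder. A minimal sketch of that ordering constraint follows. `ToyColumnWriter` and `finalize_and_sum` are hypothetical stand-ins written for illustration, not Doris classes, and the byte counting is deliberately simplistic (the reviewer's point that the value is only an estimate still applies).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a column writer; NOT the Doris ColumnWriter.
class ToyColumnWriter {
public:
    // Buffers a value in memory until finish() is called.
    void append(const std::string& value) { _buffer += value; }

    // Number of bytes currently buffered and not yet flushed.
    uint64_t estimate_buffer_size() const { return _buffer.size(); }

    // Flushes the buffered bytes and resets the buffer, so any
    // estimate taken after this call reports zero.
    void finish() {
        _flushed_bytes += _buffer.size();
        _buffer.clear();
    }

private:
    std::string _buffer;
    uint64_t _flushed_bytes = 0;
};

// Sums the per-column estimates, reading each one BEFORE finish() resets it.
uint64_t finalize_and_sum(std::vector<ToyColumnWriter>& writers) {
    uint64_t total = 0;
    for (auto& writer : writers) {
        total += writer.estimate_buffer_size();
        writer.finish();
    }
    return total;
}
```

Swapping the two statements inside the loop would make every estimate read as zero in this toy model, which is the failure mode the comment in the patch guards against.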
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
zhannngchen commented on code in PR #51001:
URL: https://github.com/apache/doris/pull/51001#discussion_r2103696654

## gensrc/proto/segment_v2.proto: ##

@@ -197,6 +197,8 @@ message ColumnMetaPB {
     optional bool result_is_nullable = 18; // used on agg_state type
     optional string function_name = 19; // used on agg_state type
     optional int32 be_exec_version = 20; // used on agg_state type
+
+    optional uint64 total_data_size = 21;

Review Comment:
   `estimate_data_size`, and add comment on this field
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
zhannngchen commented on code in PR #51001:
URL: https://github.com/apache/doris/pull/51001#discussion_r2103695664

## be/src/olap/rowset/segment_v2/segment_writer.cpp: ##

@@ -811,7 +811,11 @@ Status SegmentWriter::append_block(const vectorized::Block* block, size_t row_po
         }
         RETURN_IF_ERROR(_column_writers[id]->append(converted_result.second->get_nullmap(),
                                                     converted_result.second->get_data(), num_rows));
+        // estimate column data size for flush memtable, may be inaccurate at low cardinality
+        _footer.mutable_columns(cid)->set_total_data_size(

Review Comment:
   move to L1039
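The suggestion here is to stop updating the per-column size on every `append_block` call and record it once in the finalize loop instead. A rough sketch of that shape, under stated assumptions: `ToySegmentWriter` and `ToyColumnMeta` are hypothetical types invented for this illustration, the field name `estimate_data_size` is borrowed from the reviewer's naming suggestion on the proto field, and per-column byte counts are passed in directly rather than derived from real page builders.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-column metadata; NOT the generated ColumnMetaPB class.
struct ToyColumnMeta {
    bool has_estimate = false;
    uint64_t estimate_data_size = 0;
};

// Hypothetical segment writer that tracks buffered bytes per column.
class ToySegmentWriter {
public:
    explicit ToySegmentWriter(std::size_t num_columns)
            : _buffered(num_columns, 0), _metas(num_columns) {}

    // Called once per block: only buffers data, never touches metadata.
    void append_block(const std::vector<uint64_t>& bytes_per_column) {
        for (std::size_t cid = 0; cid < _buffered.size(); ++cid) {
            _buffered[cid] += bytes_per_column[cid];
        }
    }

    // Called once per segment: records each column's estimate, then
    // resets the buffer (standing in for column_writer->finish()).
    void finalize_columns_data() {
        for (std::size_t cid = 0; cid < _buffered.size(); ++cid) {
            _metas[cid].estimate_data_size = _buffered[cid];
            _metas[cid].has_estimate = true;
            ++_meta_updates;
            _buffered[cid] = 0;
        }
    }

    const ToyColumnMeta& meta(std::size_t cid) const { return _metas[cid]; }
    std::size_t meta_updates() const { return _meta_updates; }

private:
    std::vector<uint64_t> _buffered;
    std::vector<ToyColumnMeta> _metas;
    std::size_t _meta_updates = 0;
};
```

With this layout the metadata is written exactly once per column regardless of how many blocks were appended, and the recorded value reflects everything buffered up to the point of finalization.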
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2897913323 TPC-H: Total hot run time: 34076 ms ``` machine: 'aliyun_ecs.c7a.8xlarge_32C64G' scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools Tpch sf100 test result on commit c7f748323d3e664e5add4a06d1b3c1f6882078ea, data reload: false -- Round 1 -- q1 26336 516450865086 q2 2078272 188 188 q3 10582 1267711 711 q4 10232 983 515 515 q5 7772226524352265 q6 189 164 135 135 q7 923 761 633 633 q8 9336130311241124 q9 6802505950875059 q10 6816231618981898 q11 477 286 289 286 q12 348 352 223 223 q13 17791 367031233123 q14 235 240 224 224 q15 547 504 503 503 q16 439 433 379 379 q17 601 859 371 371 q18 7815722771507150 q19 1813946 583 583 q20 338 340 226 226 q21 3739326724142414 q22 1056997 980 980 Total cold run time: 116265 ms Total hot run time: 34076 ms - Round 2, with runtime_filter_mode=off - version_comment Doris version doris-0.0.0--b1b46766a7 Doris version doris-0.0.0--b1b46766a7 0 wait_timeout 28800 28800 0 workload_group 0 show table status; Name Engine Version Row_format RowsAvg_row_length Data_length Max_data_length Index_lengthData_free Auto_increment Create_time Update_time Check_time Collation ChecksumCreate_options Comment revenue0 ViewNULLNULL-1 0 0 NULL0 NULLNULL2023-12-26 18:27:24 NULLNULLutf-8 NULLNULL partsupp Doris NULLNULL800056 4534120086 NULL44625495NULLNULL2023-12-26 18:27:23 2023-12-26 18:44:20 NULLutf-8 NULLNULL part Doris NULLNULL200037 748811035 NULL 1935627 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:57 NULL utf-8 NULLNULL nation Doris NULLNULL25 138 3473NULL366 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:24 NULLutf-8 NULLNULL customer Doris NULLNULL150092 1381653732 NULL4374759 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:43 NULLutf-8 NULLNULL lineitem Doris NULLNULL600037902 33 19843441616 NULL61784740NULLNULL2023-12-26 18:27:23 2023-12-26 18:38:59 NULLutf-8 NULLNULL supplier Doris NULLNULL100 87 87519212NULL 194931 
NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:25 NULL utf-8 NULLNULL region Doris NULLNULL5 240 1201NULL147 NULLNULL2023-12-26 18:27:23 2023-12-26 18:27:24 NULLutf-8 NULLNULL orders Doris NULLNULL15000 42 6422171781 NULL22778155NULLNULL2023-12-26 18:27:23 2023-12-26 18:42:55 NULLutf-8 NULLNULL q1 5287512551205120 q2 235 329 238 238 q3 2162265822592259 q4 1381181514601460 q5 4451443944154415 q6 216 172 131 131 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21 q22 Total cold run time: 13732 ms Total hot run time: 13623 ms ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
hello-stephen commented on PR #51001:
URL: https://github.com/apache/doris/pull/51001#issuecomment-2897583259

# Cloud UT Coverage Report

Increment line coverage ` ` :tada:

[Increment coverage report](http://coverage.selectdb-in.cc/coverage/c7f748323d3e664e5add4a06d1b3c1f6882078ea_c7f748323d3e664e5add4a06d1b3c1f6882078ea_cloud/increment_report/index.html)
[Complete coverage report](http://coverage.selectdb-in.cc/coverage/c7f748323d3e664e5add4a06d1b3c1f6882078ea_c7f748323d3e664e5add4a06d1b3c1f6882078ea_cloud/report/index.html)

| Category | Coverage |
|---|---|
| Function Coverage | 83.25% (1113/1337) |
| Line Coverage | 66.19% (18682/28223) |
| Region Coverage | 65.81% (9264/14077) |
| Branch Coverage | 55.61% (4985/8964) |
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2897515178 run buildall
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001:
URL: https://github.com/apache/doris/pull/51001#issuecomment-2897216382

ClickBench: Total hot run time: 29.54 s
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 676c37fb719d9fd8e06a278641bb66f3d17b590f, data reload: false

query1   0.04   0.04  0.03
query2   0.12   0.10  0.11
query3   0.25   0.20  0.18
query4   1.59   0.19  0.19
query5   0.45   0.42  0.41
query6   1.16   0.67  0.65
query7   0.03   0.02  0.02
query8   0.05   0.03  0.04
query9   0.59   0.51  0.51
query10  0.56   0.56  0.56
query11  0.15   0.11  0.11
query12  0.16   0.12  0.12
query13  0.62   0.60  0.60
query14  0.78   0.81  0.81
query15  0.87   0.87  0.85
query16  0.38   0.37  0.38
query17  1.00   1.08  1.04
query18  0.23   0.21  0.22
query19  1.89   1.82  1.78
query20  0.01   0.01  0.02
query21  15.42  0.93  0.55
query22  0.76   1.18  1.02
query23  14.67  1.39  0.62
query24  7.38   1.15  0.83
query25  0.51   0.16  0.14
query26  0.52   0.16  0.13
query27  0.06   0.05  0.05
query28  9.41   0.83  0.44
query29  12.60  3.96  3.34
query30  0.26   0.08  0.06
query31  2.82   0.58  0.39
query32  3.22   0.55  0.48
query33  3.12   3.07  3.09
query34  15.88  5.17  4.51
query35  4.58   4.57  4.54
query36  0.64   0.49  0.47
query37  0.08   0.06  0.06
query38  0.05   0.03  0.04
query39  0.04   0.02  0.02
query40  0.17   0.13  0.13
query41  0.08   0.03  0.03
query42  0.03   0.02  0.03
query43  0.04   0.03  0.03
Total cold run time: 103.27 s
Total hot run time: 29.54 s
```
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2897200346 TPC-DS: Total hot run time: 185851 ms ``` machine: 'aliyun_ecs.c7a.8xlarge_32C64G' scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools TPC-DS sf100 test result on commit 676c37fb719d9fd8e06a278641bb66f3d17b590f, data reload: false query1 1001471 496 471 query2 6558179517521752 query3 6765223 226 223 query4 25452 24175 22986 22986 query5 4745632 470 470 query6 309 216 203 203 query7 4630505 290 290 query8 306 250 255 250 query9 8654266026552655 query10 483 333 268 268 query11 15895 15033 14824 14824 query12 170 108 108 108 query13 1658536 442 442 query14 9502606960226022 query15 195 189 166 166 query16 7256657 512 512 query17 1150699 561 561 query18 1984396 295 295 query19 188 225 162 162 query20 124 116 122 116 query21 208 128 110 110 query22 4159433041844184 query23 34043 33218 32998 32998 query24 8493236623862366 query25 529 478 391 391 query26 1235268 153 153 query27 2752501 332 332 query28 4312213021192119 query29 762 553 436 436 query30 282 208 189 189 query31 950 838 784 784 query32 69 68 67 67 query33 549 348 304 304 query34 789 836 523 523 query35 789 797 718 718 query36 964 1028885 885 query37 116 98 76 76 query38 4200411241774112 query39 1513141414101410 query40 212 115 105 105 query41 55 55 52 52 query42 123 114 104 104 query43 507 510 488 488 query44 1305821 815 815 query45 179 170 166 166 query46 837 1016646 646 query47 1792184117721772 query48 388 439 316 316 query49 781 505 447 447 query50 646 675 427 427 query51 4105411240104010 query52 112 109 100 100 query53 223 254 191 191 query54 582 570 504 504 query55 91 86 81 81 query56 317 306 293 293 query57 1157116011071107 query58 274 249 254 249 query59 2445266524832483 query60 323 341 337 337 query61 153 145 146 145 query62 822 730 676 676 query63 231 193 198 193 query64 4423994 681 681 query65 4256426442664264 query66 1124412 335 335 query67 15798 15508 15565 15508 query68 
7823871 515 515 query69 479 310 269 269 query70 1241116110581058 query71 434 310 302 302 query72 5539470547324705 query73 663 620 348 348 query74 8862917489408940 query75 3532321526882688 query76 34131183741 741 query77 795 379 295 295 query78 10016 10240 93259325 query79 1822808 588 588 query80 640 527 457 457 query81 486 257 223 223 query82 189 123 95 95 query83 257 251 241 241 query84 306 105 89 89 query85 781 362 317 317 query86 384 294 294 294 query87 4384448343674367 query88 2960231423072307 query89 390 313 286 286 query90 1931222 208 208 query91 140 142 114 114 query92 82 63 57 57 query93 2042962 573 573 query94 649 424 308 308 query95 377 295 284 284 query96 496 574 291 291 query97 2656279326352635 query98 235 208 213 208 query99 1309140112951295 Total cold run time: 272313 ms Total hot run time: 185851 ms ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001:
URL: https://github.com/apache/doris/pull/51001#issuecomment-2897163403

TPC-H: Total hot run time: 33957 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 676c37fb719d9fd8e06a278641bb66f3d17b590f, data reload: false

-- Round 1 --
q1   25985  5097  5014  5014
q2   2068   271   180   180
q3   10646  1277  704   704
q4   10245  1058  526   526
q5   7932   2384  2385  2384
q6   184    163   132   132
q7   923    758   620   620
q8   9329   1332  1047  1047
q9   6794   5139  5168  5139
q10  6845   2329  1905  1905
q11  482    282   276   276
q12  346    352   215   215
q13  17793  3678  3052  3052
q14  228    221   210   210
q15  534    493   509   493
q16  422    442   378   378
q17  635    880   372   372
q18  7888   7149  7099  7099
q19  1360   966   595   595
q20  352    349   230   230
q21  4296   3245  2412  2412
q22  1070   1054  974   974
Total cold run time: 116357 ms
Total hot run time: 33957 ms

- Round 2, with runtime_filter_mode=off -
q1   5103  5143  5081  5081
q2   228   321   229   229
q3   2160  2671  2271  2271
q4   1326  1818  1489  1489
q5   4587  4479  4318  4318
q6   217   166   127   127
q7   1984  1859  1758  1758
q8   2595  2579  2564  2564
q9   7173  7113  7214  7113
q10  3015  3168  2745  2745
q11  576   516   500   500
q12  729   804   606   606
q13  3501  3898  3313  3313
q14  285   288   269   269
q15  548   494   489   489
q16  426   510   454   454
q17  1155  1531  1370  1370
q18  7694  7598  7546  7546
q19  816   832   1021  832
q20  1954  2053  1816  1816
q21  4803  4352  4249  4249
q22  1084  1025  1017  1017
Total cold run time: 51959 ms
Total hot run time: 50156 ms
```
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
hello-stephen commented on PR #51001:
URL: https://github.com/apache/doris/pull/51001#issuecomment-2896898765

# Cloud UT Coverage Report

Increment line coverage ` ` :tada:

[Increment coverage report](http://coverage.selectdb-in.cc/coverage/676c37fb719d9fd8e06a278641bb66f3d17b590f_676c37fb719d9fd8e06a278641bb66f3d17b590f_cloud/increment_report/index.html)
[Complete coverage report](http://coverage.selectdb-in.cc/coverage/676c37fb719d9fd8e06a278641bb66f3d17b590f_676c37fb719d9fd8e06a278641bb66f3d17b590f_cloud/report/index.html)

| Category | Coverage |
|---|---|
| Function Coverage | 83.25% (1113/1337) |
| Line Coverage | 66.17% (18674/28220) |
| Region Coverage | 65.94% (9283/14077) |
| Branch Coverage | 55.56% (4980/8964) |
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2896835082 run buildall
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001:
URL: https://github.com/apache/doris/pull/51001#issuecomment-2894254000

ClickBench: Total hot run time: 28.52 s
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c0472c79d10037f66e33aacedc6f1313f96ac2a8, data reload: false

query1   0.04   0.03  0.04
query2   0.12   0.10  0.11
query3   0.25   0.20  0.19
query4   1.59   0.19  0.19
query5   0.42   0.40  0.43
query6   1.17   0.67  0.66
query7   0.02   0.02  0.02
query8   0.04   0.03  0.04
query9   0.57   0.51  0.53
query10  0.56   0.59  0.57
query11  0.15   0.12  0.11
query12  0.15   0.11  0.12
query13  0.62   0.60  0.60
query14  0.77   0.82  0.81
query15  0.88   0.85  0.85
query16  0.38   0.38  0.38
query17  1.07   1.00  1.02
query18  0.21   0.21  0.21
query19  1.94   1.83  1.89
query20  0.01   0.02  0.01
query21  15.41  0.90  0.54
query22  0.74   1.04  0.65
query23  15.12  1.38  0.59
query24  7.95   1.10  0.49
query25  0.51   0.19  0.06
query26  0.62   0.16  0.14
query27  0.05   0.04  0.05
query28  9.12   0.88  0.45
query29  12.55  3.97  3.29
query30  0.24   0.09  0.07
query31  2.82   0.59  0.37
query32  3.24   0.55  0.47
query33  3.00   3.04  3.10
query34  15.72  5.11  4.47
query35  4.51   4.49  4.46
query36  0.68   0.50  0.48
query37  0.08   0.06  0.07
query38  0.05   0.03  0.03
query39  0.03   0.03  0.02
query40  0.16   0.14  0.12
query41  0.08   0.03  0.02
query42  0.03   0.02  0.02
query43  0.04   0.03  0.03
Total cold run time: 103.71 s
Total hot run time: 28.52 s
```
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
zhannngchen commented on code in PR #51001: URL: https://github.com/apache/doris/pull/51001#discussion_r2097848710 ## be/test/olap/delta_writer_test.cpp: ## @@ -1047,4 +1050,223 @@ TEST_F(TestDeltaWriter, vec_sequence_col_concurrent_write) { res = engine_ref->tablet_manager()->drop_tablet(request.tablet_id, request.replica_id, false); ASSERT_TRUE(res.ok()); } + +TEST_F(TestDeltaWriter, write_statistics_segment_meta_pb) { +std::unique_ptr profile; +profile = std::make_unique("CreateTablet"); +TCreateTabletReq request; +create_tablet_request(10009, 270068380, &request); +Status res = engine_ref->create_tablet(request, profile.get()); +EXPECT_EQ(Status::OK(), res); + +TDescriptorTable tdesc_tbl = create_descriptor_tablet(); +ObjectPool obj_pool; +DescriptorTbl* desc_tbl = nullptr; +static_cast(DescriptorTbl::create(&obj_pool, tdesc_tbl, &desc_tbl)); +TupleDescriptor* tuple_desc = desc_tbl->get_tuple_descriptor(0); +auto param = std::make_shared(); + +PUniqueId load_id; +load_id.set_hi(0); +load_id.set_lo(0); +WriteRequest write_req; +write_req.tablet_id = 10009; +write_req.schema_hash = 270068380; +write_req.txn_id = 20002; +write_req.partition_id = 30003; +write_req.load_id = load_id; +write_req.tuple_desc = tuple_desc; +write_req.slots = &(tuple_desc->slots()); +write_req.is_high_priority = true; +write_req.table_schema_param = param; + +// test vec delta writer +profile = std::make_unique("LoadChannels"); +auto delta_writer = +std::make_unique(*engine_ref, write_req, profile.get(), TUniqueId {}); +EXPECT_NE(delta_writer, nullptr); + +vectorized::Block block; +for (const auto& slot_desc : tuple_desc->slots()) { + block.insert(vectorized::ColumnWithTypeAndName(slot_desc->get_empty_mutable_column(), + slot_desc->type(), slot_desc->col_name())); +} + +auto columns = block.mutate_columns(); +{ +int8_t k1 = -127; +columns[0]->insert_data((const char*)&k1, sizeof(k1)); + +int16_t k2 = -32767; +columns[1]->insert_data((const char*)&k2, sizeof(k2)); + +int32_t k3 = 
-2147483647; +columns[2]->insert_data((const char*)&k3, sizeof(k3)); + +int64_t k4 = -9223372036854775807L; +columns[3]->insert_data((const char*)&k4, sizeof(k4)); + +int128_t k5 = -9; +columns[4]->insert_data((const char*)&k5, sizeof(k5)); + +VecDateTimeValue k6; +k6.from_date_str("2048-11-10", 10); +auto k6_int = k6.to_int64(); +columns[5]->insert_data((const char*)&k6_int, sizeof(k6_int)); + +VecDateTimeValue k7; +k7.from_date_str("2636-08-16 19:39:43", 19); +auto k7_int = k7.to_int64(); +columns[6]->insert_data((const char*)&k7_int, sizeof(k7_int)); + +columns[7]->insert_data("abcd", 4); +columns[8]->insert_data("abcde", 5); + +DecimalV2Value decimal_value; +decimal_value.assign_from_double(1.1); +columns[9]->insert_data((const char*)&decimal_value, sizeof(decimal_value)); + +DateV2Value date_v2; +date_v2.from_date_str("2048-11-10", 10); +auto date_v2_int = date_v2.to_date_int_val(); +columns[10]->insert_data((const char*)&date_v2_int, sizeof(date_v2_int)); + +int8_t v1 = -127; +columns[11]->insert_data((const char*)&v1, sizeof(v1)); + +int16_t v2 = -32767; +columns[12]->insert_data((const char*)&v2, sizeof(v2)); + +int32_t v3 = -2147483647; +columns[13]->insert_data((const char*)&v3, sizeof(v3)); + +int64_t v4 = -9223372036854775807L; +columns[14]->insert_data((const char*)&v4, sizeof(v4)); + +int128_t v5 = -9; +columns[15]->insert_data((const char*)&v5, sizeof(v5)); + +VecDateTimeValue v6; +v6.from_date_str("2048-11-10", 10); +auto v6_int = v6.to_int64(); +columns[16]->insert_data((const char*)&v6_int, sizeof(v6_int)); + +VecDateTimeValue v7; +v7.from_date_str("2636-08-16 19:39:43", 19); +auto v7_int = v7.to_int64(); +columns[17]->insert_data((const char*)&v7_int, sizeof(v7_int)); + +columns[18]->insert_data("abcd", 4); +columns[19]->insert_data("abcde", 5); + +decimal_value.assign_from_double(1.1); +columns[20]->insert_data((const char*)&decimal_value, sizeof(decimal_value)); + +date_v2.from_date_str("2048-11-10", 10); +date_v2_int = 
date_v2.to_date_int_val(); +columns[21]->insert_data((const char*)&date_v2_int, sizeof(date_v2_int)); + +res = delta_writer->write(&block, {0}); Review Comment: You should test writing the block multiple times; the current implementation will get a wrong result.
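To illustrate why a single-write test can hide the problem the reviewer points at, here is a minimal sketch. It is not Doris code: `ToyColumnWriter` and both helper functions are hypothetical, and it assumes that the size estimate reports everything buffered so far rather than only the most recent block.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column writer. Assumption: the size estimate
// reports everything buffered so far, not just the most recent block.
class ToyColumnWriter {
public:
    void append(std::size_t bytes) { _buffered += bytes; }
    std::size_t estimate_buffer_size() const { return _buffered; }

private:
    std::size_t _buffered = 0;
};

// Adds the running estimate to the total after every appended block.
std::size_t total_accumulated_per_append(const std::vector<std::size_t>& blocks) {
    ToyColumnWriter writer;
    std::size_t total = 0;
    for (std::size_t bytes : blocks) {
        writer.append(bytes);
        total += writer.estimate_buffer_size();
    }
    return total;
}

// Reads the estimate once, after all blocks have been appended.
std::size_t total_read_once(const std::vector<std::size_t>& blocks) {
    ToyColumnWriter writer;
    for (std::size_t bytes : blocks) {
        writer.append(bytes);
    }
    return writer.estimate_buffer_size();
}
```

Under that assumption, one 100-byte block gives 100 from both functions, so a test that writes once passes either way. Three 100-byte blocks give 600 from the per-append version and 300 from the read-once version, which is the kind of discrepancy a multi-write test would catch.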
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2894206040

TPC-H: Total hot run time: 33983 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c0472c79d10037f66e33aacedc6f1313f96ac2a8, data reload: false

-- Round 1 --
q1   26175  5062  5003  5003
q2   2089   311   203   203
q3   10563  1261  715   715
q4   10247  996   511   511
q5   7966   2423  2339  2339
q6   189    169   140   140
q7   926    744   629   629
q8   9332   1282  1151  1151
q9   6876   5101  5117  5101
q10  6877   2337  1890  1890
q11  491    290   272   272
q12  355    352   218   218
q13  17779  3623  3105  3105
q14  232    227   210   210
q15  536    490   470   470
q16  427    443   384   384
q17  625    878   368   368
q18  7824   7223  7143  7143
q19  1800   951   549   549
q20  331    348   225   225
q21  3961   2603  2375  2375
q22  1057   992   982   982
Total cold run time: 116658 ms
Total hot run time: 33983 ms

-- Round 2, with runtime_filter_mode=off --
q1   5199   5274  5132  5132
q2   238    331   227   227
q3   2160   2663  2300  2300
q4   1340   1775  1410  1410
q5   4614   4473  4398  4398
q6   219    173   128   128
q7   1952   1930  1709  1709
q8   2607   2658  2601  2601
q9   7172   7163  7161  7161
q10  2980   3150  2754  2754
q11  573    497   517   497
q12  684    762   645   645
q13  3503   3825  3286  3286
q14  277    298   272   272
q15  517    477   476   476
q16  458    475   429   429
q17  1150   1478  1419  1419
q18  7792   7470  7399  7399
q19  868    886   1052  886
q20  1969   2010  1901  1901
q21  4730   4373  4299  4299
q22  1078   997   987   987
Total cold run time: 52080 ms
Total hot run time: 50316 ms
```
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2894239662 TPC-DS: Total hot run time: 186009 ms ``` machine: 'aliyun_ecs.c7a.8xlarge_32C64G' scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools TPC-DS sf100 test result on commit c0472c79d10037f66e33aacedc6f1313f96ac2a8, data reload: false query1 1004492 498 492 query2 6561184918131813 query3 6753227 221 221 query4 26706 23449 23371 23371 query5 4376626 453 453 query6 293 199 228 199 query7 4632509 309 309 query8 282 238 223 223 query9 8605261726432617 query10 481 329 267 267 query11 15893 14997 14874 14874 query12 162 117 104 104 query13 1664555 422 422 query14 8865612762396127 query15 207 190 175 175 query16 7132641 474 474 query17 1203733 590 590 query18 1982397 306 306 query19 199 199 166 166 query20 118 117 125 117 query21 211 134 111 111 query22 4077412339563956 query23 34049 33043 32923 32923 query24 8480240024252400 query25 569 533 403 403 query26 1228272 153 153 query27 2753503 344 344 query28 4310213021272127 query29 774 572 434 434 query30 299 204 184 184 query31 896 855 814 814 query32 81 68 61 61 query33 570 380 318 318 query34 833 866 516 516 query35 794 816 720 720 query36 977 991 884 884 query37 106 100 82 82 query38 4033416540594059 query39 1479141215801412 query40 215 119 106 106 query41 58 52 54 52 query42 140 113 110 110 query43 495 497 470 470 query44 1336852 840 840 query45 169 177 171 171 query46 845 1031640 640 query47 1752179817371737 query48 391 422 314 314 query49 792 521 429 429 query50 649 668 405 405 query51 4103413441084108 query52 121 109 103 103 query53 222 255 190 190 query54 582 568 508 508 query55 91 85 82 82 query56 303 321 275 275 query57 11441131 query58 260 255 260 255 query59 2618262725932593 query60 334 325 307 307 query61 125 126 122 122 query62 788 756 685 685 query63 227 196 192 192 query64 4303976 650 650 query65 4334425842694258 query66 1152422 316 316 query67 15923 15511 15363 15363 query68 8330892 541 
541 query69 470 357 274 274 query70 1255105911201059 query71 460 329 311 311 query72 5360472948474729 query73 738 659 360 360 query74 9074881588288815 query75 3837322126742674 query76 36581180763 763 query77 786 371 295 295 query78 10215 10163 93379337 query79 2369820 583 583 query80 621 516 456 456 query81 483 262 220 220 query82 494 127 96 96 query83 287 249 227 227 query84 295 109 87 87 query85 779 356 320 320 query86 385 378 288 288 query87 4512444043114311 query88 3253235823562356 query89 393 324 291 291 query90 1908212 211 211 query91 137 142 109 109 query92 78 58 59 58 query93 1721973 591 591 query94 669 405 306 306 query95 377 353 284 284 query96 506 587 290 290 query97 2773276226592659 query98 242 227 212 212 query99 1484139912831283 Total cold run time: 275058 ms Total hot run time: 186009 ms ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
zhannngchen commented on code in PR #51001: URL: https://github.com/apache/doris/pull/51001#discussion_r2097831686

## be/src/olap/rowset/segment_v2/segment_writer.cpp: ##
@@ -811,7 +813,18 @@ Status SegmentWriter::append_block(const vectorized::Block* block, size_t row_po
 }
 RETURN_IF_ERROR(_column_writers[id]->append(converted_result.second->get_nullmap(), converted_result.second->get_data(), num_rows));
+
+// estimate column data size for flush memtable, may be inaccurate at low cardinality
+column_data_size += _column_writers[id]->estimate_buffer_size();
+total_data_size += column_data_size;
+auto origin_column_data_size = _footer.columns(id).total_data_size();
+_footer.mutable_columns(id)->set_total_data_size(origin_column_data_size + column_data_size);
 }
+
+auto origin_data_footprint = _footer.data_footprint();
+_footer.set_data_footprint(origin_data_footprint + total_data_size);

Review Comment: You don't need to accumulate these stats in each `append_block`; you can get the final stats once in the `finalize` method.
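The suggestion above could look roughly like the following sketch. It is a toy model rather than the actual `SegmentWriter`: `ToySegmentWriter`, `ToyFooter`, and `ToyColumnMeta` are hypothetical stand-ins, and only the field names `total_data_size` and `data_footprint` come from the PR. Per-column sizes are tracked while appending, and the footer is filled exactly once in `finalize`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the per-column and per-segment footer fields.
struct ToyColumnMeta {
    std::uint64_t total_data_size = 0;
};

struct ToyFooter {
    std::vector<ToyColumnMeta> columns;
    std::uint64_t data_footprint = 0;
};

class ToySegmentWriter {
public:
    explicit ToySegmentWriter(std::size_t num_columns) : _buffered(num_columns, 0) {
        _footer.columns.resize(num_columns);
    }

    // Appending only buffers data; the footer statistics are not touched here.
    void append_block(const std::vector<std::uint64_t>& bytes_per_column) {
        assert(bytes_per_column.size() == _buffered.size());
        for (std::size_t id = 0; id < _buffered.size(); ++id) {
            _buffered[id] += bytes_per_column[id];
        }
    }

    // The statistics are written once, from the final per-column sizes.
    const ToyFooter& finalize() {
        std::uint64_t footprint = 0;
        for (std::size_t id = 0; id < _buffered.size(); ++id) {
            _footer.columns[id].total_data_size = _buffered[id];
            footprint += _buffered[id];
        }
        _footer.data_footprint = footprint;
        return _footer;
    }

private:
    std::vector<std::uint64_t> _buffered;
    ToyFooter _footer;
};
```

Because `finalize` assigns the values instead of adding to earlier ones, the result does not depend on how many times `append_block` was called.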
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
hello-stephen commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2893806170

# BE Regression && UT Coverage Report

Increment line coverage `100.00% (20/20)` :tada:

[Increment coverage report](http://coverage.selectdb-in.cc/coverage/51001_46aefdf676b089c7a32d30db93252cf71fefda34_merge/increment_report/index.html)
[Complete coverage report](http://coverage.selectdb-in.cc/coverage/51001_46aefdf676b089c7a32d30db93252cf71fefda34_merge/report/index.html)

| Category | Coverage |
|---|---|
| Function Coverage | 79.39% (20847/26258) |
| Line Coverage | 72.59% (214712/295770) |
| Region Coverage | 70.76% (126262/178430) |
| Branch Coverage | 64.51% (65421/101412) |
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2893881356

# Cloud UT Coverage Report

Increment line coverage ` ` :tada:

[Increment coverage report](http://coverage.selectdb-in.cc/coverage/c0472c79d10037f66e33aacedc6f1313f96ac2a8_c0472c79d10037f66e33aacedc6f1313f96ac2a8_cloud/increment_report/index.html)
[Complete coverage report](http://coverage.selectdb-in.cc/coverage/c0472c79d10037f66e33aacedc6f1313f96ac2a8_c0472c79d10037f66e33aacedc6f1313f96ac2a8_cloud/report/index.html)

| Category | Coverage |
|---|---|
| Function Coverage | 83.31% (1113/1336) |
| Line Coverage | 66.25% (18674/28187) |
| Region Coverage | 65.85% (9269/14076) |
| Branch Coverage | 55.61% (4985/8964) |
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2893834007 run buildall
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2893090879 run cloud_p0
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2893363088 run p0
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2893363587 run cloud_p0
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2893090714 run p0
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2893024791

# BE UT Coverage Report

Increment line coverage `100.00% (20/20)` :tada:

[Increment coverage report](http://coverage.selectdb-in.cc/coverage/46aefdf676b089c7a32d30db93252cf71fefda34_46aefdf676b089c7a32d30db93252cf71fefda34/increment_report/index.html)
[Complete coverage report](http://coverage.selectdb-in.cc/coverage/46aefdf676b089c7a32d30db93252cf71fefda34_46aefdf676b089c7a32d30db93252cf71fefda34/report/index.html)

| Category | Coverage |
|---|---|
| Function Coverage | 55.95% (14924/26674) |
| Line Coverage | 44.77% (132423/295773) |
| Region Coverage | 43.86% (66615/151878) |
| Branch Coverage | 38.45% (34133/88778) |
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2892916922

ClickBench: Total hot run time: 29.74 s
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 46aefdf676b089c7a32d30db93252cf71fefda34, data reload: false

query1   0.04   0.03  0.03
query2   0.15   0.10  0.11
query3   0.35   0.21  0.20
query4   1.59   0.21  0.09
query5   0.43   0.42  0.42
query6   1.17   0.67  0.66
query7   0.02   0.02  0.02
query8   0.05   0.05  0.04
query9   0.62   0.52  0.51
query10  0.57   0.58  0.57
query11  0.26   0.13  0.14
query12  0.25   0.14  0.13
query13  0.64   0.62  0.62
query14  0.80   0.82  0.84
query15  0.98   0.89  0.88
query16  0.38   0.38  0.38
query17  1.07   1.06  1.06
query18  0.18   0.18  0.17
query19  1.93   1.79  1.80
query20  0.01   0.02  0.01
query21  15.40  0.97  0.67
query22  0.93   1.06  0.84
query23  14.70  1.53  0.78
query24  5.52   0.55  0.28
query25  0.17   0.09  0.08
query26  0.55   0.22  0.19
query27  0.09   0.08  0.08
query28  11.00  1.20  0.58
query29  12.52  4.14  3.46
query30  0.28   0.09  0.06
query31  2.80   0.62  0.43
query32  3.23   0.59  0.50
query33  3.10   3.13  3.10
query34  16.46  5.10  4.44
query35  4.49   4.46  4.50
query36  0.64   0.51  0.51
query37  0.21   0.18  0.17
query38  0.17   0.15  0.15
query39  0.05   0.05  0.04
query40  0.22   0.16  0.15
query41  0.11   0.05  0.06
query42  0.06   0.05  0.05
query43  0.05   0.06  0.04
Total cold run time: 104.24 s
Total hot run time: 29.74 s
```
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2892910422 TPC-DS: Total hot run time: 193411 ms ``` machine: 'aliyun_ecs.c7a.8xlarge_32C64G' scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools TPC-DS sf100 test result on commit 46aefdf676b089c7a32d30db93252cf71fefda34, data reload: false query1 1418109510581058 query2 6383180817831783 query3 11050 456343494349 query4 52462 24876 23402 23402 query5 5020546 448 448 query6 338 202 195 195 query7 4869506 301 301 query8 298 243 228 228 query9 5520265826352635 query10 422 328 273 273 query11 15084 15050 15179 15050 query12 156 115 102 102 query13 1027522 400 400 query14 10176 628863906288 query15 225 203 178 178 query16 7107645 507 507 query17 1110795 616 616 query18 1582418 324 324 query19 207 194 182 182 query20 139 135 121 121 query21 216 133 115 115 query22 4386440842654265 query23 34590 33716 33531 33531 query24 6517244524182418 query25 481 491 408 408 query26 685 279 161 161 query27 2325519 343 343 query28 3158215321562153 query29 596 614 443 443 query30 275 224 190 190 query31 882 887 788 788 query32 76 71 68 68 query33 450 366 334 334 query34 781 881 569 569 query35 798 865 753 753 query36 974 999 902 902 query37 109 98 76 76 query38 4224435242644264 query39 1548150414531453 query40 211 126 111 111 query41 58 58 55 55 query42 129 117 120 117 query43 495 496 491 491 query44 1414874 866 866 query45 205 177 182 177 query46 858 1035660 660 query47 1829188118261826 query48 403 446 350 350 query49 715 505 443 443 query50 688 701 416 416 query51 4227432341384138 query52 119 120 103 103 query53 240 270 193 193 query54 583 593 524 524 query55 94 92 88 88 query56 317 332 321 321 query57 1175123411621162 query58 282 281 269 269 query59 2656281126212621 query60 372 345 345 345 query61 168 150 144 144 query62 776 767 705 705 query63 239 202 201 201 query64 15211054672 672 query65 4325424142144214 query66 723 405 302 302 query67 15963 15682 15662 15662 
query68 7257903 530 530 query69 559 331 284 284 query70 1168112611211121 query71 507 331 304 304 query72 5503470646444644 query73 1456592 357 357 query74 8972895090848950 query75 3814318026722672 query76 42321210770 770 query77 615 390 299 299 query78 10106 10149 93479347 query79 2181836 577 577 query80 591 510 533 510 query81 485 251 222 222 query82 438 127 101 101 query83 254 252 236 236 query84 307 107 88 88 query85 801 351 309 309 query86 392 292 267 267 query87 4495448444344434 query88 3543233623232323 query89 414 323 292 292 query90 1824210 205 205 query91 139 146 114 114 query92 72 61 59 59 query93 1731963 596 596 query94 675 415 305 305 query95 373 304 281 281 query96 503 581 284 284 query97 2772277326082608 query98 228 235 202 202 query99 1526138212701270 Total cold run time: 296382 ms Total hot run time: 193411 ms ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
doris-robot commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2892895357

TPC-H: Total hot run time: 34182 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 46aefdf676b089c7a32d30db93252cf71fefda34, data reload: false

-- Round 1 --
q1   26665  5062  5032  5032
q2   2093   289   188   188
q3   10499  1286  733   733
q4   10239  1039  548   548
q5   7627   2331  2363  2331
q6   191    167   138   138
q7   958    750   632   632
q8   9333   1306  1126  1126
q9   6750   5096  5108  5096
q10  6915   2351  1922  1922
q11  491    293   279   279
q12  360    362   221   221
q13  17766  3735  3139  3139
q14  242    223   215   215
q15  548    480   484   480
q16  436    434   384   384
q17  598    850   379   379
q18  7861   7176  7169  7169
q19  1826   972   564   564
q20  344    334   221   221
q21  4153   3197  2405  2405
q22  1073   1016  980   980
Total cold run time: 116968 ms
Total hot run time: 34182 ms

-- Round 2, with runtime_filter_mode=off --
q1   5234   5184  5103  5103
q2   246    336   232   232
q3   2183   2689  2324  2324
q4   1381   1879  1467  1467
q5   4483   4500  4413  4413
q6   221    173   126   126
q7   2037   1960  1778  1778
q8   2611   2564  2619  2564
q9   7300   7003  7145  7003
q10  3047   3264  2781  2781
q11  576    516   511   511
q12  700    767   626   626
q13  3585   3917  3385  3385
q14  277    307   271   271
q15  546    517   497   497
q16  443    484   445   445
q17  1176   1494  1422  1422
q18  7863   7728  7436  7436
q19  846    891   928   891
q20  1958   1977  1858  1858
q21  4957   4530  4502  4502
q22  1086   1059  1046  1046
Total cold run time: 52756 ms
Total hot run time: 50681 ms
```
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
hello-stephen commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2892803230

# Cloud UT Coverage Report

Increment line coverage ` ` :tada:

[Increment coverage report](http://coverage.selectdb-in.cc/coverage/46aefdf676b089c7a32d30db93252cf71fefda34_46aefdf676b089c7a32d30db93252cf71fefda34_cloud/increment_report/index.html)
[Complete coverage report](http://coverage.selectdb-in.cc/coverage/46aefdf676b089c7a32d30db93252cf71fefda34_46aefdf676b089c7a32d30db93252cf71fefda34_cloud/report/index.html)

| Category | Coverage |
|---|---|
| Function Coverage | 83.31% (1113/1336) |
| Line Coverage | 66.28% (18681/28187) |
| Region Coverage | 65.88% (9273/14076) |
| Branch Coverage | 55.70% (4993/8964) |
Re: [PR] [chore](info) Record `data_footprint` and `total_data_size` in SegmentFooterPB and ColumnMetaPB [doris]
wyxxxcat commented on PR #51001: URL: https://github.com/apache/doris/pull/51001#issuecomment-2892780789 run buildall