wenzhenghu opened a new pull request, #63130:
URL: https://github.com/apache/doris/pull/63130

   Cherry-pick of https://github.com/apache/doris/pull/60567
   **PR Summary**
   - This PR unifies current-query runtime statistics onto the `BE -> FE` 
reporting pipeline, replacing the previous ad-hoc `RuntimeProfile` traversal 
path, and enriches `current_queries` with task-level progress plus broader 
resource metrics.
   - The goal is to make current-query visibility closer to real time and 
consistent with audit statistics, while consolidating the FE proc/REST surfaces.
   
   **What It Solves**
   - Unifies statistics source: `QeProcessorImpl` now reads aggregated 
`TQueryStatistics` from `WorkloadRuntimeStatusMgr` instead of relying on the 
legacy `CurrentQueryInfoProvider` path.
   - Improves progress observability: introduces `process_rows`, 
`total_tasks_num`, and `finished_tasks_num`, and exposes computed `Progress`.
   - Expands runtime metrics coverage: `current_queries` now includes richer 
scan/cpu/memory/shuffle/spill/cache counters.
   - Consolidates query views: `/current_queries` and `/current_query_stmts` 
now share the same statistics view; legacy per-query/per-fragment proc 
drill-down implementation is removed.
   
   **Implementation Details**
   - Protocol layer:
     - Extends `TQueryStatistics` with `process_rows`, `finished_tasks_num`,
       and `total_tasks_num`.
   - BE collection/reporting:
     - Accumulates `process_rows` in the execution path.
     - Records `total_tasks_num` at pipeline task graph initialization and
       increments `finished_tasks_num` in real time when tasks close.
     - Mirrors task-progress counters into `QueryTaskController` so counters
       remain available even after `QueryContext` teardown.
     - Exports new fields in `ResourceContext::to_thrift_query_statistics`.
   - FE aggregation/retention:
     - `WorkloadRuntimeStatusMgr` merges additional fields (including task
       progress) and refines timeout cleanup: remove query stats only when
       they are timed out and the query no longer exists in FE.
     - `QueryStatisticsItem` now carries `TQueryStatistics` as the unified
       data carrier for proc/REST.
   - Presentation layer:
     - `CurrentQueryStatisticsProcDir` adds expanded columns and computes
       `Progress`.
     - `/rest/v2/manager/query/current_queries` in `QueryProfileAction` now
       serves the same unified stats view.
     - Removes legacy classes: `CurrentQueryInfoProvider`,
       `CurrentQuerySqlProcDir`, `CurrentQueryFragmentProcNode`, and
       `CurrentQueryStatementsProcNode`.
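
For illustration only, the task-counter lifecycle and the refined cleanup rule described above can be sketched in Python. This is not the Doris implementation: `TaskProgressCounters`, `StatsEntry`, `should_remove`, and `cleanup` are hypothetical names, and summing task counts across task graphs is an assumption.

```python
import threading
from dataclasses import dataclass


class TaskProgressCounters:
    """Hypothetical stand-in for the BE-side task-progress counters."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.total_tasks_num = 0
        self.finished_tasks_num = 0

    def on_task_graph_initialized(self, task_count: int) -> None:
        # Assumption: every initialized task graph adds its tasks to the total.
        with self._lock:
            self.total_tasks_num += task_count

    def on_task_closed(self) -> None:
        # Incremented as each task closes, so progress advances in real time.
        with self._lock:
            self.finished_tasks_num += 1


@dataclass
class StatsEntry:
    """Hypothetical FE-side record: when a BE last reported for the query."""
    last_report_ms: int


def should_remove(entry: StatsEntry, now_ms: int, timeout_ms: int,
                  query_exists_in_fe: bool) -> bool:
    # Both conditions must hold: the stats timed out AND the query is gone.
    timed_out = now_ms - entry.last_report_ms > timeout_ms
    return timed_out and not query_exists_in_fe


def cleanup(stats: dict, live_query_ids: set, now_ms: int, timeout_ms: int) -> list:
    """Drop expired entries in place and return the removed query ids."""
    expired = [
        qid for qid, entry in stats.items()
        if should_remove(entry, now_ms, timeout_ms, qid in live_query_ids)
    ]
    for qid in expired:
        del stats[qid]
    return expired
```

Under this rule, a long-running query whose reports are delayed keeps its statistics as long as FE still tracks the query.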
   
   ```
   *************************** 1. row ***************************
                          QueryId: e00b00b1155d4042-98862b60016a768a
                     ConnectionId: 394
                          Catalog: internal
                         Database: wzhtest
                             User: root
                         ExecTime: 20717
                          SqlHash: cf263b08302d8be436c97dd5e6f0d283
                        Statement: INSERT INTO test_query_progress_tb   SELECT 
DISTINCT k, CONCAT(v, CAST(k AS STRING))   FROM test_query_progress_tb   WHERE 
k % 2 = 0
                         ScanRows: 45400000 Rows
                        ScanBytes: 2.70 GB
                      ProcessRows: 75598123 Rows
                            CpuMs: 178336
               MaxPeakMemoryBytes: 13.03 GB
           CurrentUsedMemoryBytes: 8.69 GB
                  WorkloadGroupId: 1777125330381
                 ShuffleSendBytes: 0.00
                  ShuffleSendRows: 0 Rows
        ScanBytesFromLocalStorage: 31.48 MB
       ScanBytesFromRemoteStorage: 0.00
    SpillWriteBytesToLocalStorage: 0.00
   SpillReadBytesFromLocalStorage: 0.00
              BytesWriteIntoCache: 0.00
                       TotalTasks: 74
                    FinishedTasks: 51
                         Progress: 68%
   ------------------------
   -- first --
     QueryId: e2b8c99658a94743-9ebbf0d036d83295
     ConnectionId: 9
     Catalog: hive_test
     Database: tpch100_parquet
     User: root
     ExecTime: 6093
     SqlHash: f8a30e4182d72cce3eff6cb385005b1f
     Statement: select ... from supplier, lineitem l1, orders, nation ... limit 
100
     ScanRows: 621466194 Rows
     ScanBytes: 5.37 GB
     ProcessRows: 79079742 Rows
     CpuMs: 31655
     MaxPeakMemoryBytes: 2.32 GB
     CurrentUsedMemoryBytes: 2.18 GB
     WorkloadGroupId: 1777253545394
     ShuffleSendBytes: 0.00
     ShuffleSendRows: 0 Rows
     ScanBytesFromLocalStorage: 0.00
     ScanBytesFromRemoteStorage: 5.37 GB
     SpillWriteBytesToLocalStorage: 0.00
     SpillReadBytesFromLocalStorage: 0.00
     BytesWriteIntoCache: 0.00
     TotalTasks: 138
     FinishedTasks: 49
     Progress: 35%
   -- second --
     QueryId: e2b8c99658a94743-9ebbf0d036d83295
     ConnectionId: 9
     Catalog: hive_test
     Database: tpch100_parquet
     User: root
     ExecTime: 10807
     SqlHash: f8a30e4182d72cce3eff6cb385005b1f
     Statement: select ... from supplier, lineitem l1, orders, nation ... limit 
100
     ScanRows: 1102562592 Rows
     ScanBytes: 9.20 GB
     ProcessRows: 112176670 Rows
     CpuMs: 53808
     MaxPeakMemoryBytes: 3.13 GB
     CurrentUsedMemoryBytes: 2.50 GB
     WorkloadGroupId: 1777253545394
     ShuffleSendBytes: 0.00
     ShuffleSendRows: 0 Rows
     ScanBytesFromLocalStorage: 0.00
     ScanBytesFromRemoteStorage: 9.20 GB
     SpillWriteBytesToLocalStorage: 0.00
     SpillReadBytesFromLocalStorage: 0.00
     BytesWriteIntoCache: 0.00
     TotalTasks: 138
     FinishedTasks: 65
     Progress: 47%
   ```
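
The `Progress` values in the samples above are consistent with an integer percentage of `FinishedTasks` over `TotalTasks`, rounded down. A minimal Python sketch under that assumption follows; `progress_percent` is a hypothetical helper, and returning `0%` when no tasks are registered is also an assumption, not something the PR states.

```python
def progress_percent(finished_tasks: int, total_tasks: int) -> str:
    """Format task progress as a whole-number percentage, rounded down."""
    if total_tasks <= 0:
        # Assumption: nothing registered yet is shown as 0%.
        return "0%"
    # Clamp so a late or duplicated report can never show more than 100%.
    finished = min(max(finished_tasks, 0), total_tasks)
    return f"{finished * 100 // total_tasks}%"


# Values taken from the three sample rows above.
print(progress_percent(51, 74))   # 68%
print(progress_percent(49, 138))  # 35%
print(progress_percent(65, 138))  # 47%
```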
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test <!-- At least one of them must be included. -->
       - [x] Regression test
       - [x] Unit Test
       - [x] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason <!-- Add your reason?  -->
   
   - Behavior changed:
       - [ ] No.
       - [x] Yes. <!-- Explain the behavior change -->
   
   - Does this need documentation?
       - [ ] No.
       - [x] Yes. <!-- Add document PR link here. eg: 
https://github.com/apache/doris-website/pull/1214 -->
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label <!-- Add branch pick label that this PR should 
merge into -->
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

