Surya Hebbar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23950 )

Change subject: IMPALA-12191: Add hardware and OS details to runtime profile
......................................................................


Patch Set 1:

(8 comments)

The specific CPU and OS info string should be added to the respective per node 
profile. This is the direct way to identify and corelate performance issues or 
other problems with OS or CPU, otherwise we cannot quickly identify which OS 
belongs to which host. Later we can aggregate as we previously did.

It should look something like this...

    Per Node Profiles:
      surya-Precision-5560:27000:
        OS Info: Ubuntu 22.04.5 LTS
        CPU Info: Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz (32 cores)
        Filter 0 arrival: 606ms
        Filter 2 arrival: 606ms
         - HostCpuIoWaitPercentage (50.000ms): 0...
         - HostCpuSysPercentage (50.000ms): 0, 0, 2...
         - HostCpuUserPercentage (50.000ms): 2, 2, 2...
      surya-Precision-5560:27001:
        OS Info: Ubuntu 20.04.5 LTS
        CPU Info: Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz (32 cores)
        Filter 0 arrival: 100ms
        Filter 2 arrival: 606ms
         - HostCpuIoWaitPercentage (50.000ms): 0...
         - HostCpuSysPercentage (50.000ms): 0, 0, 2...
         - HostCpuUserPercentage (50.000ms): 2, 2, 2...
      surya-Precision-5560:27002:
        OS Info: Ubuntu 22.04.5 LTS
        CPU Info: Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz (32 cores)
        Filter 0 arrival: 506ms
        Filter 2 arrival: 606ms
         - HostCpuIoWaitPercentage (50.000ms): 0...
         - HostCpuSysPercentage (50.000ms): 0, 0, 2...
         - HostCpuUserPercentage (50.000ms): 2, 2, 2...

Other than this, the patchset looks good.

Thanks for working on it.

http://gerrit.cloudera.org:8080/#/c/23950/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23950/1//COMMIT_MSG@12
PS1, Line 12: in a concise format.
Add the example output in the commit message.


http://gerrit.cloudera.org:8080/#/c/23950/1//COMMIT_MSG@20
PS1, Line 20: - Manually verified the output in runtime profiles
You need to update the profiles in testdata/impala-profiles for the 
impala-profile-tool.

Maybe if the thrift is optional, the output may not change.

In that case, it's okay to leave as it is.


http://gerrit.cloudera.org:8080/#/c/23950/1//COMMIT_MSG@22
PS1, Line 22: TODO (for future patches):
            : - Add cache sizes (L1, L2, L3) information
            : - Add NUMA node details
            : - Add memory capacity and type information
            : - More detailed CPU information (clock speed,
            :   CPU flags like AVX, AVX2, etc.)
TODOs maybe better to put in JIRAs or comments within the codebase at 
appropriate places, they should not be put in commit messages.

Commit messages are permanent and should not contain TODOs that maybe part of 
future changes, unless its absolutely necessary.

Please remove this.


http://gerrit.cloudera.org:8080/#/c/23950/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/23950/1/be/src/runtime/coordinator.cc@1447
PS1, Line 1447:   // TODO(IMPALA-8126): Move to host profiles
This TODO seems to mention that we should move this to per node profiles if 
possible.

Please take a look at this, whether this can be done now.


http://gerrit.cloudera.org:8080/#/c/23950/1/be/src/runtime/coordinator.cc@1453
PS1, Line 1453: // Add hardware and OS information
As Joe had mentioned in the JIRA, it's definitely worth aggregating here for 
easier summary.

But we cannot know which Host/Coordinator/Executor is which OS or CPU version 
easily. We would need to go to each executor's root page like before and check.

The proper way is to add the specific OS version to each per node profile, and 
then aggregate here like you are doing now.


http://gerrit.cloudera.org:8080/#/c/23950/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/23950/1/tests/query_test/test_observability.py@1155
PS1, Line 1155: "Could not extract Backend CPU Info from profile"
You should print the entire profile in the error message as well.

Like it was done before...

"Could not extract Backend CPU Info from profile: \n" + profile


http://gerrit.cloudera.org:8080/#/c/23950/1/tests/query_test/test_observability.py@1168
PS1, Line 1168: "Could not extract Backend OS Info from profile"
Same here.


http://gerrit.cloudera.org:8080/#/c/23950/1/tests/query_test/test_observability.py@1171
PS1, Line 1171: # Validate format: Should contain an OS name and a count in 
parentheses
              :     # Example: "Ubuntu 22.04.5 LTS (1)"
Write example of where multiple OS and multiple types of CPU info are present 
as well.

Also see if you can test it out.

If you cannot write a test for it, see if it works as expected in such an 
environment, or manually simulate such a case through hardcoded cases.



--
To view, visit http://gerrit.cloudera.org:8080/23950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5a093b6cf9dd7dfa4321b79aeea95d6fc748dd4
Gerrit-Change-Number: 23950
Gerrit-PatchSet: 1
Gerrit-Owner: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Tue, 24 Feb 2026 10:41:16 +0000
Gerrit-HasComments: Yes

Reply via email to