This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 3c2b2361be [docs](memory) debug-tools memory part description Jemalloc 
#20054
3c2b2361be is described below

commit 3c2b2361be64b1526de56efe1538fba2953bfeba
Author: Xinyi Zou <[email protected]>
AuthorDate: Fri May 26 08:58:57 2023 +0800

    [docs](memory) debug-tools memory part description Jemalloc #20054
---
 docs/en/community/developer-guide/debug-tool.md    | 85 +++++++++++++++++++++-
 docs/zh-CN/community/developer-guide/debug-tool.md | 36 +++++++--
 2 files changed, 111 insertions(+), 10 deletions(-)

diff --git a/docs/en/community/developer-guide/debug-tool.md 
b/docs/en/community/developer-guide/debug-tool.md
index 9d14782ab5..d4eeda8ac1 100644
--- a/docs/en/community/developer-guide/debug-tool.md
+++ b/docs/en/community/developer-guide/debug-tool.md
@@ -102,9 +102,13 @@ Fe is a java process. Here are just a few simple and 
commonly used java debuggin
 
 Debugging memory is generally divided into two aspects. One is whether the 
total amount of memory use is reasonable. On the one hand, the excessive amount 
of memory use may be due to memory leak in the system, on the other hand, it 
may be due to improper use of program memory. The second is whether there is a 
problem of memory overrun and illegal access, such as program access to memory 
with an illegal address, use of uninitialized memory, etc. For the debugging of 
memory, we usually use [...]
 
-#### Log
+Doris 1.2.1 and earlier versions use TCMalloc; Doris 1.2.2 and later use
Jemalloc by default. Choose the memory debugging method that matches the Doris
version in use. To switch back to TCMalloc, build with
`USE_JEMALLOC=OFF sh build.sh --be`.
 
-When we find that the memory usage is too large, we can first check the be.out 
log to see if there is a large memory application. Because of the TCMalloc 
currently used by Doris to manage memory, when a large memory application is 
encountered, the stack of the application will be printed to the be.out file. 
The general form is as follows:
+When memory usage is too high, we can first check the BE log to see whether
there are any large memory allocations.
+
+##### TCMalloc
+
+With TCMalloc, when a large memory allocation occurs, the allocation stack is
printed to the be.out file. The general form is as follows:
 
 ```
 tcmalloc: large alloc 1396277248 bytes == 0x3f3488000 @  0x2af6f63 0x2c4095b 
0x134d278 0x134bdcb 0x133d105 0x133d1d0 0x19930ed
@@ -124,8 +128,24 @@ $ addr2line -e lib/doris_be  0x2af6f63 0x2c4095b 0x134d278 
0x134bdcb 0x133d105 0
 thread.cpp:?
 ```
 
+##### JEMALLOC
+
+Most large memory allocations in Doris go through Allocator, for example
HashTable and data serialization. These allocations are expected and are
effectively managed. Other large memory allocations are unexpected, and their
allocation stack is printed to the be.INFO file, which is usually used for
debugging. The general form is as follows:
+```
+MemHook alloc large memory: 8.2GB, stacktrace:
+Alloc Stacktrace:
+    @     0x55a6a5cf6b4d  doris::ThreadMemTrackerMgr::consume()
+    @     0x55a6a5cf99bf  malloc
+    @     0x55a6ae0caf98  operator new()
+    @     0x55a6a57cb013  doris::segment_v2::PageIO::read_and_decompress_page()
+    @     0x55a6a57719c0  doris::segment_v2::ColumnReader::read_page()
+    ……
+```
+
 #### HEAP PROFILE
 
+##### TCMalloc
+
 Sometimes high memory usage is caused not by large allocations but by the
continuous accumulation of small ones. In that case the log cannot pinpoint the
allocation, so the information has to be obtained in another way.
 
 In this case we can take advantage of TCMalloc's
[heapprofile](https://gperftools.github.io/gperftools/heapprofile.html). If the
heapprofile function is enabled, we can get the overall memory allocation usage
of the process. To use it, set the `HEAPPROFILE` environment variable before
starting Doris BE. For example:
@@ -171,7 +191,7 @@ pprof --svg lib/doris_be /tmp/doris_be.hprof.0012.heap > 
heap.svg
 
 **NOTE: enabling this option affects the execution performance of the program.
Be careful when enabling it on online instances.**
 
-#### pprof remote server
+###### pprof remote server
 
 Although heapprofile can get all the memory usage information, it has some
limitations. 1. BE must be restarted. 2. The option must stay enabled all the
time, which affects the performance of the whole process.
 
@@ -198,6 +218,65 @@ Total: 1296.4 MB
 
 The output of this command, and the way to view it, are the same as for heap
profile, so they are not described in detail here. Statistics are collected only
while the command is running, so the impact on process performance is limited
compared with heap profile.
 
+##### JEMALLOC
+
+###### 1. runtime heap dump by http
+Add `,prof:true,lg_prof_sample:10` to `JEMALLOC_CONF` in `start_be.sh` and 
restart BE, then use the jemalloc heap dump http interface to generate a heap 
dump file on the corresponding BE machine.
+
+The directory where the heap dump file is located can be configured through 
the ``jeprofile_dir`` variable in ``be.conf``, and the default is 
``${DORIS_HOME}/log``
+
+```shell
+curl http://be_host:be_webport/jeheap/dump
+```
+
+`prof`: When enabled, jemalloc generates heap dump files based on the current
memory usage. Heap profile sampling has a small performance cost, so it can be
turned off during performance testing.
+`lg_prof_sample`: the heap profile sampling interval, as a base-2 exponent. The
default value is 19, that is, a sampling interval of 512K (2^19 B), which
results in only 10% of the memory being recorded by the heap profile.
`lg_prof_sample:10` reduces the sampling interval to 1K (2^10 B). More frequent
sampling brings the heap profile closer to real memory usage, but at a greater
performance cost.
+
+For detailed parameter description, refer to 
https://linux.die.net/man/3/jemalloc.
+
+###### 2. jemalloc heap dump profiling
+
+1. Generate plain-text analysis results from a single heap dump file
+    ```shell
+    jeprof lib/doris_be heap_dump_file_1
+    ```
+
+2. Analyze the diff of two heap dumps
+    ```shell
+    jeprof lib/doris_be --base=heap_dump_file_1 heap_dump_file_2
+    ```
+
+3. Generate a call graph
+
+    Install the dependencies required for plotting
+    ```shell
+    yum install ghostscript graphviz
+    ```
+    Running the heap dump command above several times within a short period
generates multiple dump files; the first dump file can be used as the baseline
for diff analysis
+
+    ```shell
+    jeprof --dot lib/doris_be --base=heap_dump_file_1 heap_dump_file_2
+    ```
+    After running the command above, the terminal outputs a graph in dot
syntax. Paste it into an [online dot drawing
website](http://www.webgraphviz.com/) to generate a memory allocation graph, and
then analyze it. Because the graph is drawn directly from the terminal output,
this method suits servers where file transfer is inconvenient.
+
+    You can also use the following command to generate the call graph directly
as a result.pdf file, then transfer it to your local machine for viewing
+    ```shell
+    jeprof --pdf lib/doris_be --base=heap_dump_file_1 heap_dump_file_2 > 
result.pdf
+    ```
+
+###### 3. heap dump by JEMALLOC_CONF
+Periodic heap dumps can also be produced by changing the `JEMALLOC_CONF`
variable in `start_be.sh` and restarting BE
+
+1. Dump every 1MB:
+
+    Add the two settings `prof:true,lg_prof_interval:20` to the `JEMALLOC_CONF`
variable, where `prof:true` enables profiling and `lg_prof_interval:20` means
that a dump is generated for every 1MB (2^20 B) of allocation activity
+2. Dump each time a new high is reached:
+
+    Add the two settings `prof:true,prof_gdump:true` to the `JEMALLOC_CONF`
variable, where `prof:true` enables profiling and `prof_gdump:true` means that
a dump is generated each time memory usage reaches a new high
+3. Memory leak dump when the program exits:
+
+    Add the three settings `prof_leak:true,lg_prof_sample:0,prof_final:true`
to the `JEMALLOC_CONF` variable.
+
 #### LSAN
 
 
[LSAN](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)
is an address checking tool that is integrated into GCC. It can be enabled by
turning on the corresponding compilation option when building the code. When
the program has a detectable memory leak, it prints the leak stack. Doris BE has
integrated this tool; compiling with the following command produces a BE binary
with memory leak detection.
diff --git a/docs/zh-CN/community/developer-guide/debug-tool.md 
b/docs/zh-CN/community/developer-guide/debug-tool.md
index 29278d1757..435005bc1b 100644
--- a/docs/zh-CN/community/developer-guide/debug-tool.md
+++ b/docs/zh-CN/community/developer-guide/debug-tool.md
@@ -103,9 +103,15 @@ FE is a Java process. Only a few simple and commonly used Java debugging commands are listed here.
 
 
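Memory debugging generally covers two aspects. One is whether the total amount
of memory used is reasonable: excessive memory usage may be caused by a memory
leak in the system, or by improper use of memory by the program. The other is
whether there are out-of-bounds or illegal memory accesses, such as accessing
memory at an illegal address or using uninitialized memory. For memory
debugging, we generally use the following methods to track down problems.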
 
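+Doris 1.2.1 and earlier versions use TCMalloc; Doris 1.2.2 and later use
Jemalloc by default. Choose the memory debugging method that matches the Doris
version in use. To switch back to TCMalloc, build with
`USE_JEMALLOC=OFF sh build.sh --be`.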
+
 #### View logs
 
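-When memory usage is found to be too high, we can first check the be.out log
to see whether there are any large memory allocations. Because Doris currently
uses TCMalloc to manage memory, the stack of any large memory allocation is
printed to the be.out file. The general form is as follows:
+When memory usage is found to be too high, we can first check the BE log to
see whether there are any large memory allocations.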
+
+##### TCMalloc
+
+With TCMalloc, when a large memory allocation occurs, the allocation stack is
printed to the be.out file. The general form is as follows:
 
 ```
 tcmalloc: large alloc 1396277248 bytes == 0x3f3488000 @  0x2af6f63 0x2c4095b 
0x134d278 0x134bdcb 0x133d105 0x133d1d0 0x19930ed
@@ -125,8 +131,24 @@ $ addr2line -e lib/doris_be  0x2af6f63 0x2c4095b 0x134d278 
0x134bdcb 0x133d105 0
 thread.cpp:?
 ```
 
+##### JEMALLOC
+
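+Most large memory allocations in Doris go through Allocator, for example
HashTable and data serialization. These allocations are expected and are
effectively managed. Other large memory allocations are unexpected, and their
allocation stack is printed to the be.INFO file, which is usually used for
debugging. The general form is as follows: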
+```
+MemHook alloc large memory: 8.2GB, stacktrace:
+Alloc Stacktrace:
+    @     0x55a6a5cf6b4d  doris::ThreadMemTrackerMgr::consume()
+    @     0x55a6a5cf99bf  malloc
+    @     0x55a6ae0caf98  operator new()
+    @     0x55a6a57cb013  doris::segment_v2::PageIO::read_and_decompress_page()
+    @     0x55a6a57719c0  doris::segment_v2::ColumnReader::read_page()
+    ……
+```
+
 #### HEAP PROFILE
 
+##### TCMalloc
+
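 Sometimes high memory usage is caused not by large allocations but by the
continuous accumulation of small ones. In that case the log cannot pinpoint the
allocation, so the information has to be obtained in another way.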
 
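 In this case we can use TCMalloc's [HEAP
PROFILE](https://gperftools.github.io/gperftools/heapprofile.html) feature. If
HEAPPROFILE is set, we can get the overall memory allocation usage of the
process. To use it, set the `HEAPPROFILE` environment variable before starting
Doris BE. For example: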
@@ -172,7 +194,7 @@ pprof --svg lib/doris_be /tmp/doris_be.hprof.0012.heap > 
heap.svg
 
 **Note: enabling this option affects the execution performance of the program.
Be careful when enabling it on online instances**
 
-#### pprof remote server
+###### pprof remote server
 
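 Although HEAP PROFILE can get all the memory usage information, it has some
limitations. 1. BE must be restarted. 2. The option must stay enabled all the
time, which affects the performance of the whole process.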
 
@@ -199,9 +221,9 @@ Total: 1296.4 MB
 
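 The output of this command, and the way to view it, are the same as for HEAP
PROFILE, so they are not described in detail here. Statistics are collected only
while the command is running, so the impact on process performance is limited
compared with HEAP PROFILE.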
 
-#### JEMALLOC HEAP PROFILE
+##### JEMALLOC
 
-##### 1. runtime heap dump by http 
+###### 1. runtime heap dump by http
 Add `,prof:true,lg_prof_sample:10` to `JEMALLOC_CONF` in `start_be.sh` and
restart BE, then use the jemalloc heap dump http interface to generate a heap
dump file on the corresponding BE machine.
 
 The directory where the heap dump file is located can be configured through
the ``jeprofile_dir`` variable in ``be.conf``, and the default is
``${DORIS_HOME}/log``
@@ -215,7 +237,7 @@ curl http://be_host:be_webport/jeheap/dump
 
 For detailed parameter descriptions, refer to
https://linux.die.net/man/3/jemalloc.
 
-#### 2. jemalloc heap dump profiling
+###### 2. jemalloc heap dump profiling
 
 1.  Generate plain-text analysis results from a single heap dump file
 ```shell
@@ -245,8 +267,8 @@ curl http://be_host:be_webport/jeheap/dump
    jeprof --pdf lib/doris_be --base=heap_dump_file_1 heap_dump_file_2 > 
result.pdf
    ```
 
-##### 3. heap dump by JEMALLOC_CONF
-Heap dumps can be produced by changing the `JEMALLOC_CONF` variable in
`start_be.sh` and restarting BE
+###### 3. heap dump by JEMALLOC_CONF
+Periodic heap dumps can also be produced by changing the `JEMALLOC_CONF`
variable in `start_be.sh` and restarting BE
 
 1. Dump every 1MB:
 

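The `lg_prof_sample` and `lg_prof_interval` values used in the patch above are
base-2 exponents of a byte count. As a quick check of the arithmetic it quotes
(2^19 B = 512K, 2^10 B = 1K, 2^20 B = 1MB), here is a minimal shell sketch.
`lg_to_bytes` is a hypothetical helper name introduced only for illustration; it
is not part of jemalloc or Doris.

```shell
# Convert a jemalloc lg_* option value (a base-2 exponent) to a byte count.
# lg_to_bytes is a hypothetical helper, not a jemalloc or Doris command.
lg_to_bytes() {
    echo $((1 << $1))
}

lg_to_bytes 19   # default lg_prof_sample  -> 524288  (512K)
lg_to_bytes 10   # lg_prof_sample:10       -> 1024    (1K)
lg_to_bytes 20   # lg_prof_interval:20     -> 1048576 (1MB)
```

The same conversion can be used to pick an exponent for a different sampling or
dump interval before editing `JEMALLOC_CONF`.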
