This is an automated email from the ASF dual-hosted git repository.
wusheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/skywalking.git
The following commit(s) were added to refs/heads/master by this push:
new 2d97e24ec5 Add docs for profiling, and adjust menu items. (#10046)
2d97e24ec5 is described below
commit 2d97e24ec5c8b8073d3e7310202180ee0bf6ebf1
Author: 吴晟 Wu Sheng <[email protected]>
AuthorDate: Tue Nov 29 17:27:02 2022 +0800
Add docs for profiling, and adjust menu items. (#10046)
---
docs/en/changes/changes.md | 1 +
docs/en/concepts-and-designs/profiling.md | 82 +++++++++++++++++++++++++++++++
docs/menu.yml | 10 ++--
3 files changed, 89 insertions(+), 4 deletions(-)
diff --git a/docs/en/changes/changes.md b/docs/en/changes/changes.md
index a615c064b7..412de8a128 100644
--- a/docs/en/changes/changes.md
+++ b/docs/en/changes/changes.md
@@ -197,5 +197,6 @@
* Add new docs for `Report Span Attached Events` data collecting protocol.
* Add new docs for `Record` query protocol
* Update `Server Agents` and `Compatibility` for PHP agent.
+* Add docs for profiling.
All issues and pull requests are [here](https://github.com/apache/skywalking/milestone/149?closed=1)
diff --git a/docs/en/concepts-and-designs/profiling.md b/docs/en/concepts-and-designs/profiling.md
new file mode 100644
index 0000000000..d198a372aa
--- /dev/null
+++ b/docs/en/concepts-and-designs/profiling.md
@@ -0,0 +1,82 @@
+# Profiling
+
+Profiling is an on-demand diagnostic method for locating the bottlenecks of services.
+The following typical scenarios are usually well suited to profiling with various profiling tools:
+
+1. Some methods slow down the API performance.
+2. Too many threads and/or high-frequency I/O per OS process reduce CPU efficiency.
+3. Massive RPC requests congest the network and cause slow responses.
+4. Unexpected network requests caused by security issues or bugs in the code.
+
+In the SkyWalking landscape, we provide two ways to support profiling at a reasonable resource cost.
+
+1. In-process profiling is bundled with auto-instrument agents.
+2. Out-of-process profiling is powered by the eBPF agent.
+
+## In-process profiling
+
+In-process profiling is primarily provided by auto-instrument agents in VM-based runtimes.
+This feature addresses issue <1> by periodically capturing snapshots of the thread stacks.
+The OAP aggregates the thread stacks per RPC request and provides a hierarchy graph that indicates the slow methods
+based
+on the continuous snapshots.
+
+The period is usually 10-100 milliseconds. A shorter period is not recommended, because each capture usually
+causes a classic stop-the-world pause in the VM, which impacts the performance of the whole process.
+
+Learn more technical details from the post, [**Use Profiling to Fix the Blind Spot of Distributed
+Tracing**](sdk-profiling.md).
+
+For now, Java and Python agents support this.
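
As an illustration only (this is not the SkyWalking agent or OAP code), the aggregation idea described above can be sketched in Python. The function name `aggregate_stacks`, the method names, and the 10 ms period are hypothetical; the sketch assumes each snapshot stands for one sampling period of elapsed time.

```python
from collections import Counter


def aggregate_stacks(snapshots, period_ms):
    """Estimate time per call path from periodic thread-stack snapshots.

    snapshots: list of stacks, each a list of method names ordered root -> leaf.
    period_ms: sampling period; each snapshot is assumed to represent this much time.
    """
    inclusive = Counter()
    for stack in snapshots:
        # Count every prefix of the stack once per snapshot, so a caller
        # accumulates the time of everything it was seen calling.
        for depth in range(1, len(stack) + 1):
            inclusive[tuple(stack[:depth])] += 1
    return {path: hits * period_ms for path, hits in inclusive.items()}


# Hypothetical snapshots of one request's thread, taken every 10 ms.
snapshots = [
    ["handle", "query_db"],
    ["handle", "query_db"],
    ["handle", "render"],
]
estimate = aggregate_stacks(snapshots, period_ms=10)
for path, ms in sorted(estimate.items()):
    print(" > ".join(path), ms)
```

Paths that appear in more snapshots get a larger estimate, which is how a hierarchy of call paths can point at the slow method.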
+
+## Out-of-process profiling
+
+Out-of-process profiling leverages [eBPF](https://ebpf.io/), a technology with origins in the Linux kernel.
+It provides a way to extend the capabilities of the kernel safely and efficiently.
+
+### On-CPU Profiling
+
+On-CPU profiling is suitable for analyzing thread stacks when service CPU usage is high.
+The more often a thread stack is dumped, the more CPU resources that stack occupies.
+
+This is quite similar to in-process profiling in addressing issue <1>, but it runs out-of-process and is based on
+Linux eBPF.
+It is also intended for languages without a VM mechanism, which in-process agents therefore do not support, such as
+C/C++ and Rust. Golang is a special case: it exposes the metadata of the VM to eBPF, so it can be profiled.
+
+### Off-CPU Profiling
+
+Off-CPU profiling is suitable for performance issues that are not caused by high CPU usage but may occur under high CPU load.
+This profiling aims to address issue <2>.
+
+For example,
+
+1. When there are too many threads in one service, off-CPU profiling can reveal which threads spend
+   more time context switching.
+2. Code that relies heavily on disk I/O or remote service performance slows down the whole process.
+
+Off-CPU profiling provides two perspectives:
+
+1. Thread switch count: The number of times a thread switches context. When the thread returns to the CPU, it completes
+   one context switch. A thread stack with a higher switch count spends more time context switching.
+2. Thread switch duration: The time it takes a thread to switch the context. A thread stack with a higher switch
+   duration spends more time off-CPU.
+
+Learn more technical details about on-CPU and off-CPU profiling from the post, [**Pinpoint Service Mesh Critical Performance Impact
+by using eBPF**](ebpf-cpu-profiling.md).
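
The two off-CPU perspectives can be illustrated with a minimal Python sketch. It is an assumption-laden example, not how the eBPF agent is implemented: the event tuple layout, the stack labels, and the function name `summarize_off_cpu` are all hypothetical.

```python
from collections import defaultdict


def summarize_off_cpu(events):
    """Aggregate switch count and off-CPU duration per thread stack.

    events: iterable of (stack, switched_out_ns, switched_in_ns) tuples, where
    the timestamps mark when the thread left the CPU and when it returned.
    """
    summary = defaultdict(lambda: {"switch_count": 0, "switch_duration_ns": 0})
    for stack, switched_out_ns, switched_in_ns in events:
        entry = summary[stack]
        # Returning to the CPU completes one context switch.
        entry["switch_count"] += 1
        # Time between leaving and returning is time spent off-CPU.
        entry["switch_duration_ns"] += switched_in_ns - switched_out_ns
    return dict(summary)


# Hypothetical events: one stack blocks on disk reads, another on a lock.
events = [
    ("main;read_file", 1_000, 6_000),
    ("main;read_file", 9_000, 12_000),
    ("main;lock_wait", 2_000, 2_500),
]
result = summarize_off_cpu(events)
print(result)
```

Sorting the result by `switch_count` or by `switch_duration_ns` gives the two views: stacks that switch often versus stacks that stay off-CPU the longest.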
+
+### Network Profiling
+
+Network profiling captures network packets to analyze traffic at L4 (TCP) and L7 (HTTP) and to recognize network traffic
+from a specific process or a k8s pod. This traffic analysis helps locate the root causes of issues <3> and <4>.
+
+Network profiling provides
+
+1. Network topology and process identification.
+2. TCP traffic metrics with TLS status.
+3. HTTP traffic metrics.
+4. Sampled raw HTTP request/response data within the tracing context.
+5. Time costs of local I/O on the OS, such as the time Linux spends processing an HTTP request/response.
+
+Learn more technical details from the post, [**Diagnose Service Mesh Network Performance with
+eBPF**](../academy/diagnose-service-mesh-network-performance-with-ebpf.md)
\ No newline at end of file
diff --git a/docs/menu.yml b/docs/menu.yml
index 2f85c091c4..170d98787d 100644
--- a/docs/menu.yml
+++ b/docs/menu.yml
@@ -34,16 +34,18 @@ catalog:
path: "/en/concepts-and-designs/service-agent"
- name: "Manual Instrument SDK"
path: "/en/concepts-and-designs/manual-sdk"
- - name: "Backend"
+ - name: "Observability Analysis Platform"
catalog:
- name: "Overview"
path: "/en/concepts-and-designs/backend-overview"
- - name: "Observability Analysis Language"
+ - name: "Analysis Streaming Traces and Mesh Traffic"
path: "/en/concepts-and-designs/oal"
- - name: "Meter Analysis Language"
+ - name: "Analysis Metrics and Meters"
path: "/en/concepts-and-designs/mal"
- - name: "Log Analysis Language"
+ - name: "Analysis Logs"
path: "/en/concepts-and-designs/lal"
+ - name: "Profiling"
+ path: "/en/setup/backend/profiling"
- name: "Query in OAP"
path: "/en/protocols/readme#query-protocol"
- name: "Event"