First time posting on this list, but I've gone through a very similar 
exercise over the last few months and might have some useful insight for 
you.

Learning how to interpret profiles is extremely useful here.  Capturing a 
CPU profile, a heap profile, and an execution trace will each show a 
different facet of what's going on under the hood.  Luckily, since you 
have high CPU usage, the CPU profile alone will still be very useful.
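If the process isn't instrumented for profiling yet, the easiest route is 
to expose the net/http/pprof endpoints on a side port.  A minimal sketch 
(the localhost:6060 address is just a convention; pick whatever fits your 
deployment):

    package main

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* on the default mux
    )

    func main() {
        // Serve the profiling endpoints, then grab profiles with:
        //   go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
        //   go tool pprof http://localhost:6060/debug/pprof/heap
        //   curl -o trace.out 'http://localhost:6060/debug/pprof/trace?seconds=5'
        //   go tool trace trace.out
        go func() {
            log.Println(http.ListenAndServe("localhost:6060", nil))
        }()

        // ... the rest of your program ...
        select {}
    }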

A couple of low-hanging fruit: armed with a CPU and heap profile, take a 
look at both.  You say the runtime and GC dominate the CPU profile: that 
likely points to memory pressure, as you suspected.  Open up a heap 
profile, switch the `sample_index` to `alloc_space` or `alloc_objects`, 
and take a look at who the largest offenders are.  For a clearer pointer 
to the offending code's call stack, set `call_tree`, then take another 
look.
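Concretely, assuming the heap endpoint from the sketch above, an 
interactive session looks something like this:

    $ go tool pprof http://localhost:6060/debug/pprof/heap
    (pprof) sample_index=alloc_space
    (pprof) top 10
    (pprof) call_tree=true
    (pprof) top 10

Note that `alloc_space`/`alloc_objects` count everything allocated since 
the program started, which is what you want when GC cost is the problem, 
while `inuse_space`/`inuse_objects` show only what is live right now.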

I believe that spending a few hours or days learning the pprof and trace 
tools would pay dividends given the scope of your task.  It's hard to 
give more detailed performance suggestions while flying blind.  
Personally, when I had a similar-looking profile, the two low-hanging 
fruit were goroutine management (some goroutines were living longer than 
expected) and reducing memory usage (work and allocation were needlessly 
duplicated in several places).
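On the goroutine side, the fix in my case was simply tying each worker's 
lifetime to a context.  A minimal sketch, where Job and process() are 
hypothetical stand-ins for your own types (leaked goroutines show up in 
the /debug/pprof/goroutine profile):

    package worker

    import "context"

    type Job struct{ /* ... */ }

    func process(j Job) { /* ... */ }

    // worker exits when its context is cancelled or its jobs channel
    // closes, instead of lingering for the life of the process.
    func worker(ctx context.Context, jobs <-chan Job) {
        for {
            select {
            case <-ctx.Done():
                return
            case j, ok := <-jobs:
                if !ok {
                    return
                }
                process(j)
            }
        }
    }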

If you truly have that large a number of in-use objects (shown by 
`inuse_space` / `inuse_objects` in the heap profile), then I agree some 
form of sync.Pool or memory reuse may be beneficial.
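For what it's worth, here's a minimal sync.Pool sketch, assuming the hot 
allocation is a bytes.Buffer; substitute whatever `alloc_objects` points 
at in your profile:

    package main

    import (
        "bytes"
        "sync"
    )

    var bufPool = sync.Pool{
        New: func() interface{} { return new(bytes.Buffer) },
    }

    func handle(data []byte) {
        buf := bufPool.Get().(*bytes.Buffer)
        buf.Reset() // pooled objects keep their previous contents
        defer bufPool.Put(buf)

        buf.Write(data)
        // ... use buf ...
    }

    func main() {
        handle([]byte("hello"))
    }

Keep in mind the GC is free to drop pooled objects, so this reduces 
allocation churn rather than pinning memory.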

On Friday, July 16, 2021 at 5:27:03 AM UTC-7 rmfr wrote:

> I run it on an 8-core, 16 GB machine and it occupies all the CPU cores 
> it can.
>
> 1. It is ~95% CPU-intensive, with ~5% network communication.
>
> 2. The codebase is huge, with more than 300 thousand lines of code 
> (not even counting third-party libraries).
>
> 3. The pprof tool says nearly 50% of the time is spent in the runtime, 
> on things related to GC: mallocgc, memclrNoHeapPointers, and so on.
>
> 4. It has ~100 million dynamic objects.
>
> Do you guys have some good advice to optimize the performance?
>
> One idea that occurs to me is to use something like sync.Pool to buffer 
> the most frequently allocated and freed objects. But the problem is I 
> haven't managed to find a Go tool to identify such objects. The runtime 
> provides an API to get the number of objects, but not detailed 
> statistics on all objects. Please correct me if I'm wrong. 
> Thanks a lot :-)
>
