On Sun, Mar 20, 2016 at 9:41 AM, William Tu <u9012...@gmail.com> wrote:
>
> Hi Han Zhou,
>
> Just curious and not related to the bitwise_rscan().
> Do you get a chance to know what this kernel symbol is?

Here is the report with kernel symbols resolved.

--- before optimization ---
+  36.27%  ovn-controller  ovn-controller     [.] bitwise_rscan
+   5.24%  ovn-controller  libc-2.19.so       [.] _int_malloc
+   4.08%  ovn-controller  libc-2.19.so       [.] __memcmp_sse4_1
+   3.68%  ovn-controller  libc-2.19.so       [.] _int_free
+   2.99%  ovn-controller  ovn-controller     [.] lex_token_parse
+   2.55%  ovn-controller  ovn-controller     [.] flow_wildcards_hash
+   2.48%  ovn-controller  ovn-controller     [.] match_hash
+   2.07%  ovn-controller  ovn-controller     [.] ofctrl_add_flow
+   1.68%  ovn-controller  ovn-controller     [.] ovn_flow_lookup
+   1.45%  ovn-controller  [kernel.kallsyms]  [k] clear_page_c_e

--- after optimization ---
+   8.46%  ovn-controller  libc-2.19.so       [.] _int_malloc
+   5.97%  ovn-controller  ovn-controller     [.] bitwise_rscan
+   5.95%  ovn-controller  libc-2.19.so       [.] _int_free
+   4.83%  ovn-controller  ovn-controller     [.] lex_token_parse
+   3.91%  ovn-controller  ovn-controller     [.] match_hash
+   3.87%  ovn-controller  ovn-controller     [.] flow_wildcards_hash
+   3.12%  ovn-controller  ovn-controller     [.] ofctrl_add_flow
+   2.41%  ovn-controller  [kernel.kallsyms]  [k] clear_page_c_e

The top kernel symbol is "clear_page_c_e".

>
> btw, I remembered you have tried jemalloc. Does jemalloc have lower __int_malloc/free number?

With jemalloc, _int_malloc/_int_free no longer show up at all; they are
replaced by jemalloc's own malloc/free. In addition, clear_page_c_e moved
to the top of the profile:

--- perf report with both bitwise_rscan optimization and jemalloc enabled ---
+   6.07%  ovn-controller  [kernel.kallsyms]  [k] clear_page_c_e
+   5.67%  ovn-controller  libjemalloc.so.1   [.] free
+   5.60%  ovn-controller  ovn-controller     [.] bitwise_rscan
+   4.28%  ovn-controller  libjemalloc.so.1   [.] 0x000000000000be83
+   3.85%  ovn-controller  ovn-controller     [.] lex_token_parse
+   3.69%  ovn-controller  ovn-controller     [.] flow_wildcards_hash
+   3.68%  ovn-controller  ovn-controller     [.] match_hash
+   3.54%  ovn-controller  ovn-controller     [.] ofctrl_add_flow
+   2.05%  ovn-controller  libjemalloc.so.1   [.] malloc

It is hard to judge the gain from jemalloc from the perf report alone, but
in our earlier tests the end-to-end test finished 10% faster. We can verify
that observation again with the bitwise_rscan optimization applied.

Please let us know your thoughts on this data. Thanks!

--
Best regards,
Han
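
A note on reading the "before optimization" and "after optimization" profiles: perf percentages are shares of the total samples in each run, so when bitwise_rscan shrinks, every other symbol's share rises even if its absolute cost is unchanged. The sketch below estimates what share an unchanged symbol would be expected to show after the optimization. It assumes (and this is only an assumption, not something measured in this thread) that both runs performed the same workload and that only bitwise_rscan's cost changed; `renormalized_share` is a hypothetical helper name.

```python
def renormalized_share(before_pct, hot_before_pct, hot_after_pct):
    """Expected share (in %) of an unchanged symbol after the hot symbol shrinks.

    Assumes all work other than the hot symbol costs the same in both runs.
    """
    # Work that did not change, in units of "percent of the before run".
    rest = 100.0 - hot_before_pct
    # Hot symbol's new cost in the same units, derived from its share after.
    hot_after = hot_after_pct * rest / (100.0 - hot_after_pct)
    total_after = rest + hot_after
    return 100.0 * before_pct / total_after


# Shares taken from the "before optimization" report in this mail.
before = {
    "_int_malloc": 5.24,
    "_int_free": 3.68,
    "lex_token_parse": 2.99,
    "clear_page_c_e": 1.45,
}

for symbol, pct in before.items():
    # bitwise_rscan went from 36.27% to 5.97% between the two reports.
    expected = renormalized_share(pct, 36.27, 5.97)
    print(f"{symbol:16s} before {pct:5.2f}%  expected after {expected:5.2f}%")
```

Under these assumptions, a symbol whose observed share after the optimization is well above its expected share (or well below it) is a candidate for a closer look, while symbols close to the estimate most likely only moved because of renormalization.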