[
https://issues.apache.org/jira/browse/DISPATCH-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16854704#comment-16854704
]
Francesco Nigro commented on DISPATCH-1352:
-------------------------------------------
[~kgiusti]
{quote}Instead of allocating a qd_message_pvt_t structure we allocate a single
block of memory large enough to hold the qd_message_pvt_t structure and N
qd_buffer_t structures and lay them down "cheek to jowl" in the buffer, linking
the qd_buffer_ts as normal, but incrementing the refcount to prevent freeing
them individually.{quote}
Allocating the qd_buffer_t upfront (3*N with N >= 1) would help: in the
contexts where these lists get cloned, they seem to always contain *at
least* 1 qd_buffer_t, so allocating them upfront makes total sense.
And the lifecycle of these qd_buffer_t is already bound to that of the
qd_message_pvt_t.
The code impact is a whole different story: we need to recognize the
"embedded" qd_buffer_t when freeing a qd_buffer_list_t, to avoid deallocating
them individually.
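To make the idea concrete, here is a rough sketch of what I have in mind. All
the sk_* names and the struct layouts are hypothetical stand-ins, not the real
qd_buffer_t/qd_message_pvt_t definitions. It also assumes an address-range
check to recognize embedded buffers, instead of the refcount trick from the
quote, and models a single list rather than the three ma_* lists:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for qd_buffer_t (the real layout differs). */
typedef struct sk_buffer {
    struct sk_buffer *next;      /* singly linked, for brevity */
    size_t            size;      /* bytes used in data[] */
    unsigned char     data[64];
} sk_buffer_t;

typedef struct {
    sk_buffer_t *head;
    sk_buffer_t *tail;
} sk_buffer_list_t;

/* Hypothetical stand-in for qd_message_pvt_t with N buffers laid out
 * right after the header, in the same allocation. */
typedef struct {
    sk_buffer_list_t to_override;
    size_t           embedded_count;
    sk_buffer_t      embedded[];  /* C99 flexible array member */
} sk_message_t;

/* One allocation for the message plus n embedded buffers. */
sk_message_t *sk_message_alloc(size_t n)
{
    sk_message_t *m = calloc(1, sizeof(*m) + n * sizeof(sk_buffer_t));
    if (m)
        m->embedded_count = n;
    return m;
}

/* True if b lives inside the message's own block. */
int sk_is_embedded(const sk_message_t *m, const sk_buffer_t *b)
{
    uintptr_t p  = (uintptr_t) b;
    uintptr_t lo = (uintptr_t) m->embedded;
    uintptr_t hi = (uintptr_t) (m->embedded + m->embedded_count);
    return p >= lo && p < hi;
}

void sk_list_append(sk_buffer_list_t *l, sk_buffer_t *b)
{
    assert(l && b);
    b->next = NULL;
    if (l->tail)
        l->tail->next = b;
    else
        l->head = b;
    l->tail = b;
}

/* Free only the heap-allocated buffers; embedded ones go away with the
 * message itself. Returns how many buffers were individually freed. */
size_t sk_list_free(sk_message_t *m, sk_buffer_list_t *l)
{
    size_t freed = 0;
    sk_buffer_t *b = l->head;
    while (b) {
        sk_buffer_t *next = b->next;
        if (!sk_is_embedded(m, b)) {
            free(b);
            freed++;
        }
        b = next;
    }
    l->head = l->tail = NULL;
    return freed;
}
```

The range check keeps the free path branch-cheap and needs no extra field in
the buffer, but it does mean the list free needs access to the owning message.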
{quote}That would avoid the extra calls to qd_buffer_t allocate, make better
use of the cache (fingers crossed) all without having to touch the iterator
code (which is everywhere and expects qd_buffer_t based data).{quote}
That's my bet too :)
Fingers crossed!
> qd_buffer_list_clone cost is dominated by cache misses
> ------------------------------------------------------
>
> Key: DISPATCH-1352
> URL: https://issues.apache.org/jira/browse/DISPATCH-1352
> Project: Qpid Dispatch
> Issue Type: Improvement
> Components: Routing Engine
> Affects Versions: 1.7.0
> Reporter: Francesco Nigro
> Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> The cost of qd_buffer_list_clone, called from qd_message_copy for
> qd_message_pvt_t.ma_to_override/ma_trace/ma_ingress, is dominated by
> cache misses:
> * to "allocate" new qd_buffer_t
> * to reference any qd_buffer_t from the source qd_buffer_list_t
> This cost is the main reason why the core thread has a very low IPC (<
> 1 instr/cycle). Given that the router handles this work on a single thread,
> solving it should bring a huge performance improvement and let the router
> scale better.