Hi,

On 2017/10/19 18:02, Ananyev, Konstantin wrote:

Hi Jia,


Hi


On 10/13/2017 9:02 AM, Jia He Wrote:
Hi Jerin


On 10/13/2017 1:23 AM, Jerin Jacob Wrote:
-----Original Message-----
Date: Thu, 12 Oct 2017 17:05:50 +0000

[...]
On the same lines,

Jia He, jie2.liu, bing.zhao,

Is this patch based on code review, or did you see this issue on any
arm/ppc target? arm64 will have a performance impact with this change.
Sorry, I missed one important piece of information.
Our platform is an aarch64 server with 46 CPUs.
If we reduce the number of CPUs involved, the bug occurs less frequently.

Yes, the mb barrier impacts performance, but correctness is more
important, isn't it ;-)
Maybe we can find some other, more lightweight barrier here?
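
To make the "lightweight barrier" idea concrete, here is a minimal sketch using C11 acquire/release ordering instead of a full barrier. It is only an illustration under stated assumptions: `sketch_ring`, `sketch_enqueue`, `sketch_dequeue` and `SKETCH_RING_SIZE` are hypothetical names, the ring is single-producer/single-consumer, and it is not the rte_ring code. Whether this ordering is sufficient for the multi-producer/multi-consumer case discussed here would still need to be verified.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 8u /* hypothetical capacity */

/* Index masking below only works for power-of-two sizes. */
static_assert((SKETCH_RING_SIZE & (SKETCH_RING_SIZE - 1u)) == 0,
              "ring size must be a power of two");

/* Hypothetical single-producer/single-consumer ring, NOT rte_ring. */
struct sketch_ring {
    _Atomic unsigned int head; /* next slot to read, written by consumer */
    _Atomic unsigned int tail; /* next slot to write, written by producer */
    void *slots[SKETCH_RING_SIZE];
};

/* Returns 0 on success, -1 if the ring is full. */
static int sketch_enqueue(struct sketch_ring *r, void *obj)
{
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire pairs with the consumer's release store to head, so the
     * slot about to be overwritten has already been read. */
    unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == SKETCH_RING_SIZE)
        return -1;
    r->slots[tail & (SKETCH_RING_SIZE - 1u)] = obj;
    /* Release: the slot write above is visible before the new tail is. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static int sketch_dequeue(struct sketch_ring *r, void **obj)
{
    unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire pairs with the producer's release store to tail, so the
     * slot contents are visible before they are read. */
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return -1;
    *obj = r->slots[head & (SKETCH_RING_SIZE - 1u)];
    /* Release: the slot read above completes before the slot is freed. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return 0;
}
```

On aarch64, acquire loads and release stores are usually cheaper than a full barrier, which is why this kind of pairing might be worth measuring.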

Cheers,
Jia
Based on mbuf_autotest, the rte_panic will be invoked in seconds.

PANIC in test_refcnt_iter():
(lcore=0, iter=0): after 10s only 61 of 64 mbufs left free
1: [./test(rte_dump_stack+0x38) [0x58d868]]
Aborted (core dumped)


So is it only reproducible with the mbuf refcnt test?
Could it be reproduced with some 'pure' ring test
(no mempools/mbuf refcnt, etc.)?
The reason I am asking: in that test we also have mbuf refcnt updates
(that's what that test was created for), and we are doing some optimizations
there too, to avoid excessive atomic updates.
BTW, if the problem is not reproducible without mbuf refcnt,
can I suggest extending the test with:
   - a check that the enqueue() operation was successful
   - a walk through the pool that checks/prints the refcnt of each mbuf.
Hopefully that would give us some extra information about what is going wrong here.
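
A rough sketch of the second check, the pool walk, is below. It is standalone and uses hypothetical stand-ins (`sketch_mbuf`, `sketch_check_refcnt`) rather than the real mbuf and mempool types, so it only shows the shape of the check. In the actual test the walk would presumably go through the mempool object iterator and read each mbuf's refcnt with the existing accessor.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an mbuf: only the reference count matters here. */
struct sketch_mbuf {
    unsigned int refcnt;
};

/*
 * Walk the pool, print every object whose refcnt differs from the expected
 * value, and return how many such objects were found.
 */
static size_t sketch_check_refcnt(const struct sketch_mbuf *pool, size_t n,
                                  unsigned int expected)
{
    size_t bad = 0;

    for (size_t i = 0; i < n; i++) {
        if (pool[i].refcnt != expected) {
            printf("mbuf[%zu]: refcnt=%u (expected %u)\n",
                   i, pool[i].refcnt, expected);
            bad++;
        }
    }
    return bad;
}
```

Calling this right before the panic would show whether the missing mbufs are stuck with an unexpected refcnt or never made it back through the ring.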
Konstantin
Currently, the issue has only been found in this case, on the ARM platform; I am not sure how it behaves on the x86_64 platform.
In another mail in this thread, we made a simple test based on this, captured some information, and I pasted it there (I pasted the patch there too :-)).
It also seems that Juhamatti & Jacob found a related revert from several months ago.

BR. Bing
