> -----Original Message-----
> From: Lazarenko, Vlad (WorldQuant) [mailto:[email protected]]
> Sent: Thursday, March 1, 2018 10:53 PM
> To: Tan, Jianfeng; '[email protected]'
> Subject: RE: Multi-process recovery (is it even possible?)
>
> Hello Jianfeng,
>
> Thanks for getting back to me. I thought about using "udata64", too, but
> that didn't work for me when a single packet was fanned out to multiple
> slave processes. More importantly, it looks like if a slave process crashes
> in the middle of getting packets from or putting packets back to a pool, we
> could end up with a deadlock. So I guess I'd have to think about a different
> design or be ready to bounce all of the processes if one of them fails.

OK, a better design that avoids such a hard issue is a good way to go. Good luck!

Thanks,
Jianfeng

>
> Thanks,
> Vlad
>
> > -----Original Message-----
> > From: Tan, Jianfeng [mailto:[email protected]]
> > Sent: Thursday, March 01, 2018 3:20 AM
> > To: Lazarenko, Vlad (WorldQuant); '[email protected]'
> > Subject: RE: Multi-process recovery (is it even possible?)
> >
> > > -----Original Message-----
> > > From: users [mailto:[email protected]] On Behalf Of Lazarenko, Vlad
> > > (WorldQuant)
> > > Sent: Thursday, March 1, 2018 2:54 AM
> > > To: '[email protected]'
> > > Subject: [dpdk-users] Multi-process recovery (is it even possible?)
> > >
> > > Guys,
> > >
> > > I am looking for possible solutions for the following problems that
> > > come along with asymmetric multi-process architecture...
> > >
> > > Given that multiple processes share the same RX/TX queue(s) and packet
> > > pool(s), and that one packet from an RX queue may be fanned out to
> > > multiple slave processes, is there a way to recover from a slave
> > > crashing (or exiting w/o cleaning up properly)? In theory it could have
> > > incremented an mbuf reference count more than once, and unless
> > > everything is restarted, I don't see a reliable way to release those
> > > mbufs back to the pool.
> >
> > Recycling an element is too difficult; from what I know, it's next to
> > impossible. Recycling a memzone/mempool is easier. So in your case, you
> > might want to use different pools for different queues (processes).
> >
> > If you really want to recycle an element, rte_mbuf in your case, it might
> > be doable by:
> > 1. Set up an rx callback for each process, and in the callback, store a
> >    special flag at rte_mbuf->udata64.
> > 2. When the primary detects that a secondary is down, iterate over all
> >    elements with the special flag and put them back into the ring.
> >
> > There is a small chance for this to fail: an mbuf is allocated by a
> > secondary process, and the process crashes before the mbuf is flagged.
> >
> > Thanks,
> > Jianfeng
> >
> > >
> > > Also, if a spinlock is involved and either master or slave crashes,
> > > everything simply gets stuck. Is there any way to detect this (i.e.
> > > outside of data path)..?
> > >
> > > Thanks,
> > > Vlad
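
For readers of the archive, the flag-and-sweep idea from the thread (tag each buffer with its owner in the rx callback, then have the primary return every buffer tagged by a dead secondary) can be sketched in plain C. This is only a toy model under stated assumptions, not DPDK code: model_mbuf, model_pool, tag_owner and reclaim_owner are hypothetical names, the pool is a fixed array with a free stack, and there is no shared memory, locking, or reference counting. In a real application the tagging would presumably live in a callback registered with rte_eth_add_rx_callback() and the sweep in a callback passed to rte_mempool_obj_iter(), but that mapping is an assumption and is not checked against any particular DPDK release.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 8
#define NO_OWNER  0u   /* tag value meaning "free or not yet tagged" */

/* Hypothetical stand-in for rte_mbuf: only the field named in the thread. */
struct model_mbuf {
    uint64_t udata64;  /* owner tag written by the rx callback */
};

/* Hypothetical stand-in for a mempool: fixed storage plus a free stack. */
struct model_pool {
    struct model_mbuf objs[POOL_SIZE];
    struct model_mbuf *free_stack[POOL_SIZE];
    unsigned free_count;
};

static void pool_init(struct model_pool *p)
{
    p->free_count = 0;
    for (unsigned i = 0; i < POOL_SIZE; i++) {
        p->objs[i].udata64 = NO_OWNER;
        p->free_stack[p->free_count++] = &p->objs[i];
    }
}

/* Allocation does not tag; tagging is a separate, later step. */
static struct model_mbuf *pool_get(struct model_pool *p)
{
    return p->free_count ? p->free_stack[--p->free_count] : NULL;
}

static void pool_put(struct model_pool *p, struct model_mbuf *m)
{
    m->udata64 = NO_OWNER;
    p->free_stack[p->free_count++] = m;
}

/* Step 1: what the per-process rx callback would do. */
static void tag_owner(struct model_mbuf *m, uint64_t owner)
{
    m->udata64 = owner;
}

/* Step 2: the primary walks every element and returns those tagged by the
 * dead process. Returns the number of elements reclaimed. */
static unsigned reclaim_owner(struct model_pool *p, uint64_t dead_owner)
{
    unsigned reclaimed = 0;
    for (unsigned i = 0; i < POOL_SIZE; i++) {
        if (p->objs[i].udata64 == dead_owner) {
            pool_put(p, &p->objs[i]);
            reclaimed++;
        }
    }
    return reclaimed;
}

/* Process 2 takes three buffers, tags only two, then "crashes".
 * Returns how many buffers are still missing after the sweep. */
static unsigned model_demo(void)
{
    struct model_pool pool;
    pool_init(&pool);

    struct model_mbuf *a = pool_get(&pool);
    struct model_mbuf *b = pool_get(&pool);
    struct model_mbuf *c = pool_get(&pool);  /* crash before tagging */
    tag_owner(a, 2);
    tag_owner(b, 2);
    (void)c;

    reclaim_owner(&pool, 2);
    return POOL_SIZE - pool.free_count;
}
```

In this model, model_demo() leaves one buffer unrecovered, which is the window Jianfeng describes: a buffer allocated but not yet flagged when the process dies. The single tag also shows why the scheme did not fit Vlad's fan-out case, since one udata64 value can name only one owner, and the sweep does nothing for a process that dies while holding a lock.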
