On 2014/9/1 16:18, Michael S. Tsirkin wrote:
> On Fri, Aug 29, 2014 at 06:40:24PM +0800, Zhangjie (HZ) wrote:
>>
>> On 2014/8/27 20:59, Michael S. Tsirkin wrote:
>>> On Thu, Aug 21, 2014 at 03:42:53PM +0800, Zhangjie (HZ) wrote:
>>>> On 2014/8/21 14:53, Jason Wang wrote:
>>>>> On 08/21/2014 02:28 PM, Zhangjie (HZ) wrote:
>>>>>>
>>>>>> After migration, vhost is not disabled; the virtual nic became
>>>>>> unreachable because vhost is not awakened.
>>>>>> By the logic of EVENT_IDX, virtio-net will not kick vhost again if the
>>>>>> used idx is not updated.
>>>>>> So, if one interrupt is lost during migration, virtio_net will not kick
>>>>>> vhost again.
>>>>>> Then, no skb from virtio-net can be sent to tap.
>>>>>
>>>>> Yes and I mean to test vhost=off to see if it was the issue of vhost.
>>>> That sounds reasonable, but the test case is to test vhost.
>>>>>>
>>>>>> Jason's patch reduced the probability of occurrence, from about 1/20 to
>>>>>> 1/80. It is really effective. I think the patch should be acked.
>>>>>> Maybe we can try to solve the problem from another perspective. Do you
>>>>>> have some method to sense the migration?
>>>>>> We can make up a signal from virtio-net after the migration.
>>>>>
>>>>> You can make a patch like this to debug. If problem disappears, it means
>>>>> interrupt was really lost anyway.
>>>>>>
>>>>>>> Anyway, I will try to reproduce it by myself.
>>>>>>>
>>>>>> The test environment is really terrible. I built an environment myself,
>>>>>> but the problem did not occur.
>>>>>> The environment I use now is from a colleague responsible for test work.
>>>>>> Two hosts, every host has about 20 vms, they send packets (ipv4 and
>>>>>> ipv6) between each other.
>>>>>> The VM to be migrated also sends packets itself, and there is a ping (-i
>>>>>> 0.001) from another host to it.
>>>>>> The physical nic is 1GE, connected through an internal network.
>>>>>
>>>>> Just want to confirm. For the problem did not occur, you mean with my
>>>>> patch on top?
>>>>> .
>>>>>
>>>> I mean, with your patch, I have to test 80 times before it occurs, the
>>>> probability is reduced.
>>>
>>> Could you please try to apply the patch
>>> [PATCH V4] net: Forbid dealing with packets when VM is not running
>>> on top and see if this helps?
>>>
>>> Thanks!
>>>
>>>> --
>>>> Best Wishes!
>>>> Zhang Jie
>>> .
>>>
>> Thanks! I will have a test.
>
> Great, once you have the result of the two patches applied
> together, please let us know on the list.
>
>
>> --
>> Best Wishes!
>> Zhang Jie
> .
>
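
For reference, here is a minimal sketch of the EVENT_IDX check discussed in the
quoted thread. The arithmetic mirrors vring_need_event() from the Linux virtio
ring header; the name need_event() and the index values in the note below are
only illustrative assumptions, not taken from the failing setup.

```c
#include <stdint.h>

/*
 * Illustrative copy of the EVENT_IDX test (same arithmetic as
 * vring_need_event() in the Linux virtio ring header).
 *
 * event_idx: index the other side asked to be notified at
 * old_idx:   ring index before the new buffers were added
 * new_idx:   ring index after the new buffers were added
 *
 * Returns nonzero only when event_idx falls inside [old_idx, new_idx),
 * with 16-bit wraparound handled by the unsigned casts.
 */
static inline int need_event(uint16_t event_idx, uint16_t new_idx,
                             uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

With an event index of 10, adding the buffer at index 10 (old 10, new 11)
requests a kick. If that single notification is lost and the event index is
never advanced, later additions (old 11, new 12, and so on) no longer satisfy
the check, which matches the stall described above.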

I'm sorry for not providing the test results in time; the test resources have
been busy these days. I can perhaps give the results next week.
--
Best Wishes!
Zhang Jie