My knowledge of KVM is very limited, but from what I understand from
Edison, we should consider what has to be done to fix this problem once it
occurs.

- Shut down all VMs on all affected hosts.
- Unmount (umount) the NFS mount point.
- Reestablish the storage pool.
- Restart the VMs.

Given how severe these actions are to the end user, I would vote for the file 
lock to ensure it never happens, even if it's slower.
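
To make the file lock idea concrete, here is a rough sketch in Python. It
is only an illustration, not what CloudStack does today: the names
pool_lock and refresh_with_lock, and the lock file path, are made up for
this example. It also assumes the NFS server and clients support POSIX
(fcntl) locking; without that the lock would not be honored across hosts.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def pool_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path while the block runs.

    lock_path is a hypothetical lock file on the shared NFS primary
    storage, so every host contends for the same lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until no other process holds the lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def refresh_with_lock(lock_path, refresh):
    """Run refresh() (any callable) while holding the pool lock."""
    with pool_lock(lock_path):
        return refresh()
```

A caller would wrap whatever touches the pool, e.g.
refresh_with_lock("/mnt/primary/.pool.lock", do_refresh), where both the
path and do_refresh are placeholders. The blocking wait is where the
slowdown Wei mentions would come from: concurrent deployments queue up
behind one another.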

--Alex

> -----Original Message-----
> From: Wei ZHOU [mailto:ustcweiz...@gmail.com]
> Sent: Tuesday, July 16, 2013 3:35 AM
> To: dev@cloudstack.apache.org
> Subject: Re: How to fix libvirt storage pool refresh issue?
> 
> I agree with Wido.
> 
> Moreover, the file lock will degrade the performance of VM deployment.
> 
> -Wei
> 
> 
> 2013/7/16 Wido den Hollander <w...@widodh.nl>
> 
> > On 07/16/2013 12:27 AM, Marcus Sorensen wrote:
> >
> >>     I'm ok with a symptom fix on our end, if the root cause is in
> >> Libvirt we can't do much about that. This is the sort of patch that
> >> tends to get pulled into the regular update cycle of the
> >> distributions, so unless there's more to it and it's not a good fix I
> >> imagine we will see it come through without having to wait for the
> >> next point releases. We still have to support existing users who
> >> might not be running the latest, though, so the symptom fix is
> >> probably ok as a temporary measure.
> >>
> >
> > I'm ok with not calling storagePoolRefresh every time we want a
> > capacity update, since that's also kind of I/O intensive for larger
> > storage arrays.
> >
> > However, we should make sure we have a GOOD comment in the code
> > about this "fix", since that's the reason I initially removed the
> > old code which invoked "df".
> >
> > I'll see if I can get this libvirt patch into Ubuntu when it hits
> > libvirt upstream, since this bug is really annoying.
> >
> > Wido
> >
> >
> >
> >> On Mon, Jul 15, 2013 at 3:42 PM, Edison Su <edison...@citrix.com> wrote:
> >>
> >>> There is a serious issue on KVM
> >>> (https://issues.apache.org/jira/browse/CLOUDSTACK-2729):
> >>> a libvirt storage pool can disappear on a KVM host, and it's easy
> >>> to reproduce in our internal QA environment.
> >>> Wei found the root cause, which is in libvirt:
> >>> "
> >>> This is a libvirt issue. I created a ticket for it.
> >>> https://bugzilla.redhat.com/show_bug.cgi?id=977706
> >>> The patch is very simple.
> >>> https://www.redhat.com/archives/libvir-list/2013-July/msg00635.html
> >>> "
> >>> But it's also triggered by CloudStack, as CloudStack calls the
> >>> libvirt storage pool refresh method each time it accesses the
> >>> storage pool. The code was added by commit
> >>> 2ffc9907f7b0d371737e39b7649f7af23026f5cf,
> >>> less than a year ago.
> >>>
> >>> As Wei suggested, we can call storage pool refresh only when
> >>> needed, which will mitigate the issue (that's the behavior I
> >>> implemented in CloudStack pre-4.0), but it only treats the symptom,
> >>> not the cause.
> >>> Or we can add a cluster-wide lock so that only one caller can access
> >>> the storage pool at a time; we could use a file lock on the NFS
> >>> primary storage.
> >>> Any ideas/feedback on how to fix this KVM issue?
> >>>
> >>>
> >>>
> >>>
> >
