Thanks. I applied the workaround to the .vmx files and rebooted all VMs. No more freezes!

On Sun, Jan 21, 2018 at 3:43 PM, Nick Fisk <n...@fisk.me.uk> wrote:

> How up to date is your VM environment? We saw something very similar last
> year with Linux VMs running newish kernels. It turns out newer kernels
> supported a new feature of the vmxnet3 adapters which had a bug in ESXi.
> The fix was released last year some time in ESXi6.5 U1, or a workaround was
> to set an option in the VM config.
>
> https://kb.vmware.com/s/article/2151480
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of *Youzhong Yang
> *Sent:* 21 January 2018 19:50
> *To:* Brad Hubbard <bhubb...@redhat.com>
> *Cc:* ceph-users <ceph-users@lists.ceph.com>
> *Subject:* Re: [ceph-users] Ubuntu 17.10 or Debian 9.3 + Luminous = random OS hang ?
>
> As someone suggested, I installed linux-generic-hwe-16.04 package on
> Ubuntu 16.04 to get kernel of 17.10, and then rebooted all VMs, here is
> what I observed:
>
> - ceph monitor node froze upon reboot, in another case froze after a few minutes
> - ceph OSD hosts easily froze
> - ceph admin node (which runs no ceph service but ceph-deploy) never freezes
> - ceph rgw nodes and ceph mgr so far so good
>
> Here are two images I captured:
>
> https://drive.google.com/file/d/11hMJwhCF6Tj8LD3nlpokG0CB_oZqI506/view?usp=sharing
> https://drive.google.com/file/d/1tzDQ3DYTnfDHh_hTQb0ISZZ4WZdRxHLv/view?usp=sharing
>
> Thanks.
>
> On Sat, Jan 20, 2018 at 7:03 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
>
> On Fri, Jan 19, 2018 at 11:54 PM, Youzhong Yang <youzh...@gmail.com> wrote:
> > I don't think it's a hardware issue. All the hosts are VMs. By the way, using
> > the same set of VMWare hypervisors, I switched back to Ubuntu 16.04 last
> > night, so far so good, no freeze.
>
> Too little information to make any sort of assessment I'm afraid but,
> at this stage, this doesn't sound like a ceph issue.
>
> > On Fri, Jan 19, 2018 at 8:50 AM, Daniel Baumann <daniel.baum...@bfh.ch>
> > wrote:
> >>
> >> Hi,
> >>
> >> On 01/19/18 14:46, Youzhong Yang wrote:
> >> > Just wondering if anyone has seen the same issue, or it's just me.
> >>
> >> we're using debian with our own backported kernels and ceph, works rock
> >> solid.
> >>
> >> what you're describing sounds more like hardware issues to me. if you
> >> don't fully "trust"/have confidence in your hardware (and your logs
> >> don't reveal anything), I'd recommend running some burn-in tests
> >> (memtest, cpuburn, etc.) on them for 24 hours/machine to rule out
> >> cpu/ram/etc. issues.
> >>
> >> Regards,
> >> Daniel
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> --
> Cheers,
> Brad

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com