Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
Centos 7 - the upgrade was done simply with "yum update -y ceph" on each node one by one, so the package order would have been determined by yum.

From: Jason Dillaman
Sent: Monday, June 6, 2016 10:42 PM
To: Adrian Saul
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Jewel upgrade - rbd errors after upgrade

What OS are you using? It actually sounds like the plugins were updated, the Infernalis OSD was reset, and then the Jewel OSD was installed.
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
What OS are you using? It actually sounds like the plugins were updated, the Infernalis OSD was reset, and then the Jewel OSD was installed.

On Sun, Jun 5, 2016 at 10:42 PM, Adrian Saul wrote:
>
> Thanks Jason.
>
> I don’t have anything specified explicitly for osd class dir. I suspect it
> might be related to the OSDs being restarted during the package upgrade
> process before all libraries are upgraded.
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
Thanks Jason.

I don’t have anything specified explicitly for osd class dir. I suspect it might be related to the OSDs being restarted during the package upgrade process before all libraries are upgraded.

> -----Original Message-----
> From: Jason Dillaman [mailto:jdill...@redhat.com]
> Sent: Monday, 6 June 2016 12:37 PM
> To: Adrian Saul
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
>
> Odd -- sounds like you might have Jewel and Infernalis class objects and
> OSDs intermixed. I would double-check your installation and see if your
> configuration has any overload for "osd class dir".
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
I couldn't find anything wrong with the packages and everything seemed installed OK. Once I restarted the OSDs the directory issue went away, but the error started moving to other rbd output, and the same class open error occurred on other OSDs.

I have gone through and bounced all the OSDs and that seems to have cleared the issue. I am guessing that the restart of the OSDs during the package upgrade occurs before all library packages are upgraded, so they start with the wrong versions loaded, and when these class libraries are dynamically opened later they fail.

> -----Original Message-----
> From: Adrian Saul
> Sent: Monday, 6 June 2016 12:29 PM
> To: Adrian Saul; dilla...@redhat.com
> Cc: ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] Jewel upgrade - rbd errors after upgrade
>
> I have traced it back to an OSD giving this error:
>
> 2016-06-06 12:18:14.315573 7fd714679700 -1 osd.20 23623 class rbd open got
> (5) Input/output error
> 2016-06-06 12:19:49.835227 7fd714679700  0 _load_class could not open class
> /usr/lib64/rados-classes/libcls_rbd.so (dlopen failed):
> /usr/lib64/rados-classes/libcls_rbd.so: undefined symbol:
> _ZN4ceph6buffer4list8iteratorC1EPS1_j
>
> Trying to figure out why that is the case.
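[One way to test this stale-library theory, not from the thread: inspect which shared-object images a long-running process still has mapped. This is a sketch; the use of the current shell's PID below is only so the example is self-contained, and on an affected node you would substitute a ceph-osd PID.]

```shell
# Sketch: list the shared-object files a running process has mapped.
# $$ (this shell) keeps the example self-contained; on an affected node
# you would substitute a ceph-osd PID, e.g.
#   pid=$(pgrep -f ceph-osd | head -1)
pid=$$
awk '$6 ~ /\.so/ { print $6 }' "/proc/$pid/maps" | sort -u
# Entries suffixed "(deleted)" would mean the process is still running
# against library images that were replaced on disk mid-upgrade.
```

[If the pre-upgrade libraries show as "(deleted)" mappings, a restart of the daemon is the only way for it to pick up the new files, which matches the fix described above.]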
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
Odd -- sounds like you might have Jewel and Infernalis class objects and OSDs intermixed. I would double-check your installation and see if your configuration has any overload for "osd class dir".

On Sun, Jun 5, 2016 at 10:28 PM, Adrian Saul wrote:
>
> I have traced it back to an OSD giving this error:
>
> 2016-06-06 12:18:14.315573 7fd714679700 -1 osd.20 23623 class rbd open
> got (5) Input/output error
> 2016-06-06 12:19:49.835227 7fd714679700  0 _load_class could not open
> class /usr/lib64/rados-classes/libcls_rbd.so (dlopen failed):
> /usr/lib64/rados-classes/libcls_rbd.so: undefined symbol:
> _ZN4ceph6buffer4list8iteratorC1EPS1_j
>
> Trying to figure out why that is the case.
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
I have traced it back to an OSD giving this error:

2016-06-06 12:18:14.315573 7fd714679700 -1 osd.20 23623 class rbd open got (5) Input/output error
2016-06-06 12:19:49.835227 7fd714679700  0 _load_class could not open class /usr/lib64/rados-classes/libcls_rbd.so (dlopen failed): /usr/lib64/rados-classes/libcls_rbd.so: undefined symbol: _ZN4ceph6buffer4list8iteratorC1EPS1_j

Trying to figure out why that is the case.

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Adrian Saul
> Sent: Monday, 6 June 2016 11:11 AM
> To: dilla...@redhat.com
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
>
> No - it throws a usage error - if I add a file argument after it works:
>
> [root@ceph-glb-fec-02 ceph]# rados -p glebe-sata get rbd_id.hypervtst-lun04 /tmp/crap
> [root@ceph-glb-fec-02 ceph]# cat /tmp/crap
> 109eb01f5f89de
>
> stat works:
>
> [root@ceph-glb-fec-02 ceph]# rados -p glebe-sata stat rbd_id.hypervtst-lun04
> glebe-sata/rbd_id.hypervtst-lun04 mtime 2016-06-06 10:55:08.00, size 18
>
> I can do a rados ls:
>
> [root@ceph-glb-fec-02 ceph]# rados ls -p glebe-sata|grep rbd_id
> rbd_id.cloud2sql-lun01
> rbd_id.glbcluster3-vm17
> rbd_id.holder   <<< a create that said it failed while I was debugging this
> rbd_id.pvtcloud-nfs01
> rbd_id.hypervtst-lun05
> rbd_id.test02
> rbd_id.cloud2sql-lun02
> rbd_id.fiotest2
> rbd_id.radmast02-lun04
> rbd_id.hypervtst-lun04
> rbd_id.cloud2fs-lun00
> rbd_id.radmast02-lun03
> rbd_id.hypervtst-lun00
> rbd_id.cloud2sql-lun00
> rbd_id.radmast02-lun02
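[As an aside, not from the thread: the mangled name in that dlopen error can be decoded with binutils' c++filt, which shows the class plugin is looking for a ceph::buffer::list iterator constructor that the still-running pre-Jewel process apparently does not export.]

```shell
# Decode the mangled C++ symbol from the dlopen failure (needs binutils'
# c++filt on PATH). The demangled form is a ceph::buffer::list::iterator
# constructor taking a list pointer and an unsigned int.
echo '_ZN4ceph6buffer4list8iteratorC1EPS1_j' | c++filt
```

[On an affected host one could then check whether the installed core libraries actually export that symbol, e.g. with something like `nm -D /usr/lib64/librados.so.2 | grep buffer4list8iterator`; the path is illustrative and varies by release.]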
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
The rbd_directory object is empty -- all data is stored as omap key/value pairs which you can list via "rados listomapvals rbd_directory". What is the output when you run "rbd ls --debug-ms=1 glebe-sata" and "rbd info --debug-ms=1 glebe-sata/hypervtst-lun04"? I am interested in the lines that look like the following:

** rbd ls **
2016-06-05 22:22:54.816801 7f25d4e4d1c0  1 -- 127.0.0.1:0/2033136975 --> 127.0.0.1:6800/29402 -- osd_op(client.4111.0:2 0.30a98c1c rbd_directory [call rbd.dir_list] snapc 0=[] ack+read+known_if_redirected e7) v7 -- ?+0 0x5598b0459410 con 0x5598b04580d0
2016-06-05 22:22:54.817396 7f25b8207700  1 -- 127.0.0.1:0/2033136975 <== osd.0 127.0.0.1:6800/29402 2 osd_op_reply(2 rbd_directory [call] v0'0 uv1 ondisk = 0) v7 133+0+27 (2231830616 0 2896097477) 0x7f258c000a20 con 0x5598b04580d0

** rbd info **
2016-06-05 22:25:54.534064 7fab3cff9700  1 -- 127.0.0.1:0/951637948 --> 127.0.0.1:6800/29402 -- osd_op(client.4112.0:2 0.6a181655 rbd_id.foo [call rbd.get_id] snapc 0=[] ack+read+known_if_redirected e7) v7 -- ?+0 0x7fab180020a0 con 0x55e833b5e520
2016-06-05 22:25:54.534434 7fab4c589700  1 -- 127.0.0.1:0/951637948 <== osd.0 127.0.0.1:6800/29402 2 osd_op_reply(2 rbd_id.foo [call] v0'0 uv2 ondisk = 0) v7 130+0+16 (2464064221 0 855464132) 0x7fab24000b40 con 0x55e833b5e520

I suspect you are having issues with executing OSD class methods for some reason (like rbd.dir_list against rbd_directory and rbd.get_id against rbd_id.<image-name>).

On Sun, Jun 5, 2016 at 9:16 PM, Adrian Saul wrote:
>
> Seems like my rbd_directory is empty for some reason:
>
> [root@ceph-glb-fec-02 ceph]# rados get -p glebe-sata rbd_directory /tmp/dir
> [root@ceph-glb-fec-02 ceph]# strings /tmp/dir
> [root@ceph-glb-fec-02 ceph]# ls -la /tmp/dir
> -rw-r--r--. 1 root root 0 Jun  6 11:12 /tmp/dir
>
> [root@ceph-glb-fec-02 ceph]# rados stat -p glebe-sata rbd_directory
> glebe-sata/rbd_directory mtime 2016-06-06 10:18:28.00, size 0
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
Seems like my rbd_directory is empty for some reason:

[root@ceph-glb-fec-02 ceph]# rados get -p glebe-sata rbd_directory /tmp/dir
[root@ceph-glb-fec-02 ceph]# strings /tmp/dir
[root@ceph-glb-fec-02 ceph]# ls -la /tmp/dir
-rw-r--r--. 1 root root 0 Jun  6 11:12 /tmp/dir

[root@ceph-glb-fec-02 ceph]# rados stat -p glebe-sata rbd_directory
glebe-sata/rbd_directory mtime 2016-06-06 10:18:28.00, size 0

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Adrian Saul
> Sent: Monday, 6 June 2016 11:11 AM
> To: dilla...@redhat.com
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
>
> No - it throws a usage error - if I add a file argument after it works:
>
> [root@ceph-glb-fec-02 ceph]# rados -p glebe-sata get rbd_id.hypervtst-lun04 /tmp/crap
> [root@ceph-glb-fec-02 ceph]# cat /tmp/crap
> 109eb01f5f89de
>
> stat works:
>
> [root@ceph-glb-fec-02 ceph]# rados -p glebe-sata stat rbd_id.hypervtst-lun04
> glebe-sata/rbd_id.hypervtst-lun04 mtime 2016-06-06 10:55:08.00, size 18
>
> I can do a rados ls:
>
> [root@ceph-glb-fec-02 ceph]# rados ls -p glebe-sata|grep rbd_id
> rbd_id.cloud2sql-lun01
> rbd_id.glbcluster3-vm17
> rbd_id.holder   <<< a create that said it failed while I was debugging this
> rbd_id.pvtcloud-nfs01
> rbd_id.hypervtst-lun05
> rbd_id.test02
> rbd_id.cloud2sql-lun02
> rbd_id.fiotest2
> rbd_id.radmast02-lun04
> rbd_id.hypervtst-lun04
> rbd_id.cloud2fs-lun00
> rbd_id.radmast02-lun03
> rbd_id.hypervtst-lun00
> rbd_id.cloud2sql-lun00
> rbd_id.radmast02-lun02
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
No - it throws a usage error - if I add a file argument it works:

[root@ceph-glb-fec-02 ceph]# rados -p glebe-sata get rbd_id.hypervtst-lun04 /tmp/crap
[root@ceph-glb-fec-02 ceph]# cat /tmp/crap
109eb01f5f89de

stat works:

[root@ceph-glb-fec-02 ceph]# rados -p glebe-sata stat rbd_id.hypervtst-lun04
glebe-sata/rbd_id.hypervtst-lun04 mtime 2016-06-06 10:55:08.00, size 18

I can do a rados ls:

[root@ceph-glb-fec-02 ceph]# rados ls -p glebe-sata | grep rbd_id
rbd_id.cloud2sql-lun01
rbd_id.glbcluster3-vm17
rbd_id.holder   <<< a create that said it failed while I was debugging this
rbd_id.pvtcloud-nfs01
rbd_id.hypervtst-lun05
rbd_id.test02
rbd_id.cloud2sql-lun02
rbd_id.fiotest2
rbd_id.radmast02-lun04
rbd_id.hypervtst-lun04
rbd_id.cloud2fs-lun00
rbd_id.radmast02-lun03
rbd_id.hypervtst-lun00
rbd_id.cloud2sql-lun00
rbd_id.radmast02-lun02

> -----Original Message-----
> From: Jason Dillaman [mailto:jdill...@redhat.com]
> Sent: Monday, 6 June 2016 11:00 AM
> To: Adrian Saul
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
>
> Are you able to run the following command successfully?
>
>     rados -p glebe-sata get rbd_id.hypervtst-lun04
>
> On Sun, Jun 5, 2016 at 8:49 PM, Adrian Saul wrote:
> >
> > I upgraded my Infernalis semi-production cluster to Jewel on Friday. The
> > upgrade went through smoothly (aside from a time-wasting restorecon of
> > /var/lib/ceph in the selinux package upgrade) and the services continued
> > running without interruption. However, this morning when I went to create
> > some new RBD images I found I am unable to do much at all with RBD.
> >
> > Just about any rbd command fails with an I/O error. I can run showmapped,
> > but that is about it - anything like an ls, info or status fails. This
> > applies to all my pools.
> >
> > I can see no errors in any log files that suggest an issue. I have also
> > tried the commands on other cluster members that have not done anything
> > with RBD before (I was wondering if perhaps the kernel rbd client was
> > pinning the old library version open or something), but the same error
> > occurs.
> >
> > Where can I start trying to resolve this?
> >
> > Cheers,
> >  Adrian
> >
> > [root@ceph-glb-fec-01 ceph]# rbd ls glebe-sata
> > rbd: list: (5) Input/output error
> > 2016-06-06 10:41:31.792720 7f53c06a2d80 -1 librbd: error listing image
> > in directory: (5) Input/output error
> > 2016-06-06 10:41:31.792749 7f53c06a2d80 -1 librbd: error listing v2
> > images: (5) Input/output error
> >
> > [root@ceph-glb-fec-01 ceph]# rbd ls glebe-ssd
> > rbd: list: (5) Input/output error
> > 2016-06-06 10:41:33.956648 7f90de663d80 -1 librbd: error listing image
> > in directory: (5) Input/output error
> > 2016-06-06 10:41:33.956672 7f90de663d80 -1 librbd: error listing v2
> > images: (5) Input/output error
> >
> > [root@ceph-glb-fec-02 ~]# rbd showmapped
> > id  pool       image                 snap device
> > 0   glebe-sata test02                -    /dev/rbd0
> > 1   glebe-ssd  zfstest               -    /dev/rbd1
> > 10  glebe-sata hypervtst-lun00       -    /dev/rbd10
> > 11  glebe-sata hypervtst-lun02       -    /dev/rbd11
> > 12  glebe-sata hypervtst-lun03       -    /dev/rbd12
> > 13  glebe-ssd  nspprd01_lun00        -    /dev/rbd13
> > 14  glebe-sata cirrux-nfs01          -    /dev/rbd14
> > 15  glebe-sata hypervtst-lun04       -    /dev/rbd15
> > 16  glebe-sata hypervtst-lun05       -    /dev/rbd16
> > 17  glebe-sata pvtcloud-nfs01        -    /dev/rbd17
> > 18  glebe-sata cloud2sql-lun00       -    /dev/rbd18
> > 19  glebe-sata cloud2sql-lun01       -    /dev/rbd19
> > 2   glebe-sata radmast02-lun00       -    /dev/rbd2
> > 20  glebe-sata cloud2sql-lun02       -    /dev/rbd20
> > 21  glebe-sata cloud2fs-lun00        -    /dev/rbd21
> > 22  glebe-sata cloud2fs-lun01        -    /dev/rbd22
> > 3   glebe-sata radmast02-lun01       -    /dev/rbd3
> > 4   glebe-sata radmast02-lun02       -    /dev/rbd4
> > 5   glebe-sata radmast02-lun03       -    /dev/rbd5
> > 6   glebe-sata radmast02-lun04       -    /dev/rbd6
> > 7   glebe-ssd  sybase_iquser02_lun00 -    /dev/rbd7
> > 8   glebe-ssd  sybase_iquser03_lun00 -    /dev/rbd8
> > 9   glebe-ssd  sybase_iquser04_lun00 -    /dev/rbd9
> >
> > [root@ceph-glb-fec-02 ~]# rbd status glebe-sata/hypervtst-lun04
> > 2016-06-06 10:47:30.221453 7fc0030dc700 -1 librbd::image::OpenRequest:
> > failed to retrieve image id: (5) Input/output error
> > 2016-06-06 10:47:30.221556 7fc0028db700 -1 librbd::ImageState: failed
> > to open image: (5) Input/output error
> > rbd: error opening image hypervtst-lun04: (5) Input/output error
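An aside on the `rados get`/`stat` output above: `stat` reports 18 bytes while `cat` shows a 14-character image id, which matches Ceph's usual length-prefixed string encoding (a 4-byte little-endian length followed by the id bytes). A minimal Python sketch of that decoding - the `decode_rbd_id` helper is illustrative only, not a Ceph API:

```python
import struct

def decode_rbd_id(blob: bytes) -> str:
    # Assumed layout of an rbd_id.* object: 4-byte little-endian
    # length header, then the image id as ASCII bytes.
    (n,) = struct.unpack_from("<I", blob, 0)
    return blob[4:4 + n].decode("ascii")

# 18 bytes total, as rados stat reported: 4-byte header + 14-char id.
blob = struct.pack("<I", 14) + b"109eb01f5f89de"
print(decode_rbd_id(blob))  # → 109eb01f5f89de
```

This is why `cat` on the fetched file shows the id with a few stray header bytes in front of it.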
Re: [ceph-users] Jewel upgrade - rbd errors after upgrade
Are you able to run the following command successfully?

    rados -p glebe-sata get rbd_id.hypervtst-lun04

On Sun, Jun 5, 2016 at 8:49 PM, Adrian Saul wrote:
>
> I upgraded my Infernalis semi-production cluster to Jewel on Friday. The
> upgrade went through smoothly (aside from a time-wasting restorecon of
> /var/lib/ceph in the selinux package upgrade) and the services continued
> running without interruption. However, this morning when I went to create
> some new RBD images I found I am unable to do much at all with RBD.
>
> Just about any rbd command fails with an I/O error. I can run showmapped,
> but that is about it - anything like an ls, info or status fails. This
> applies to all my pools.
>
> I can see no errors in any log files that suggest an issue. I have also
> tried the commands on other cluster members that have not done anything
> with RBD before (I was wondering if perhaps the kernel rbd client was
> pinning the old library version open or something), but the same error
> occurs.
>
> Where can I start trying to resolve this?
>
> Cheers,
>  Adrian
>
> [root@ceph-glb-fec-01 ceph]# rbd ls glebe-sata
> rbd: list: (5) Input/output error
> 2016-06-06 10:41:31.792720 7f53c06a2d80 -1 librbd: error listing image in
> directory: (5) Input/output error
> 2016-06-06 10:41:31.792749 7f53c06a2d80 -1 librbd: error listing v2
> images: (5) Input/output error
>
> [root@ceph-glb-fec-01 ceph]# rbd ls glebe-ssd
> rbd: list: (5) Input/output error
> 2016-06-06 10:41:33.956648 7f90de663d80 -1 librbd: error listing image in
> directory: (5) Input/output error
> 2016-06-06 10:41:33.956672 7f90de663d80 -1 librbd: error listing v2
> images: (5) Input/output error
>
> [root@ceph-glb-fec-02 ~]# rbd showmapped
> id  pool       image                 snap device
> 0   glebe-sata test02                -    /dev/rbd0
> 1   glebe-ssd  zfstest               -    /dev/rbd1
> 10  glebe-sata hypervtst-lun00       -    /dev/rbd10
> 11  glebe-sata hypervtst-lun02       -    /dev/rbd11
> 12  glebe-sata hypervtst-lun03       -    /dev/rbd12
> 13  glebe-ssd  nspprd01_lun00        -    /dev/rbd13
> 14  glebe-sata cirrux-nfs01          -    /dev/rbd14
> 15  glebe-sata hypervtst-lun04       -    /dev/rbd15
> 16  glebe-sata hypervtst-lun05       -    /dev/rbd16
> 17  glebe-sata pvtcloud-nfs01        -    /dev/rbd17
> 18  glebe-sata cloud2sql-lun00       -    /dev/rbd18
> 19  glebe-sata cloud2sql-lun01       -    /dev/rbd19
> 2   glebe-sata radmast02-lun00       -    /dev/rbd2
> 20  glebe-sata cloud2sql-lun02       -    /dev/rbd20
> 21  glebe-sata cloud2fs-lun00        -    /dev/rbd21
> 22  glebe-sata cloud2fs-lun01        -    /dev/rbd22
> 3   glebe-sata radmast02-lun01       -    /dev/rbd3
> 4   glebe-sata radmast02-lun02       -    /dev/rbd4
> 5   glebe-sata radmast02-lun03       -    /dev/rbd5
> 6   glebe-sata radmast02-lun04       -    /dev/rbd6
> 7   glebe-ssd  sybase_iquser02_lun00 -    /dev/rbd7
> 8   glebe-ssd  sybase_iquser03_lun00 -    /dev/rbd8
> 9   glebe-ssd  sybase_iquser04_lun00 -    /dev/rbd9
>
> [root@ceph-glb-fec-02 ~]# rbd status glebe-sata/hypervtst-lun04
> 2016-06-06 10:47:30.221453 7fc0030dc700 -1 librbd::image::OpenRequest:
> failed to retrieve image id: (5) Input/output error
> 2016-06-06 10:47:30.221556 7fc0028db700 -1 librbd::ImageState: failed to
> open image: (5) Input/output error
> rbd: error opening image hypervtst-lun04: (5) Input/output error

--
Jason

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
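The working theory earlier in the thread is that the OSDs were restarted mid-upgrade, leaving a Jewel OSD loading an Infernalis-era library and failing with an undefined symbol. The real check on a node is simply `rpm -qa | grep -E 'ceph|rados|rbd'` and confirming one version everywhere; the helper below is a hypothetical illustration of that consistency check (package names are real, the data is made up, with 10.2.x = Jewel and 9.2.x = Infernalis):

```python
from collections import defaultdict

def version_skew(packages):
    """Group packages by version; return the groups if more than one
    distinct version is installed (a partially applied upgrade),
    otherwise None."""
    by_version = defaultdict(list)
    for name, version in packages:
        by_version[version].append(name)
    return dict(by_version) if len(by_version) > 1 else None

# Example mirroring the suspected state: an OSD restarted while
# some libraries were still at the Infernalis version.
installed = [
    ("ceph-osd", "10.2.1"),     # Jewel
    ("librados2", "9.2.1"),     # Infernalis leftover
    ("ceph-common", "10.2.1"),
]
print(version_skew(installed))
# → {'10.2.1': ['ceph-osd', 'ceph-common'], '9.2.1': ['librados2']}
```

If the check reports skew, finishing the package upgrade and then restarting the OSDs (rather than letting them restart mid-transaction) is the remediation the thread converges on.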